Handle lines longer than 1000 bytes: error "bad mail input format" #18

blueyed · 2014-01-11T06:00:12Z

Currently dma chokes on mail input which contains lines longer than 1000 bytes, because of the buffer being used.
When it does not find a newline at the end, it adds one, but then assumes/requires this to be the last line.

The error is "bad mail input format".

This might happen when using dma to handle web application errors during development / local setup, and the included debug information contains huge environment etc settings (e.g. "LS_COLORS").

I think that it would be saner to not require RFC2822 compliance, but accept the mail nonetheless.

The code in question: https://github.com/corecode/dma/blob/master/mail.c#L376

corecode · 2014-01-11T22:06:58Z

Do you have a suggestion how to deal with longer lines? Something that
does not complicate the code?

On 01/11/2014 07:00 AM, Daniel Hahler wrote:

Currently dma chokes on mail input which contains lines longer than
1000 bytes, because of the buffer being used.
When it does not find a newline at the end, it adds one, but then
assumes/requires this to be the last line.

The error is "bad mail input format".

This might happen when using dma to handle web application errors
during development / local setup, and the included debug information
contains huge environment etc settings (e.g. "LS_COLORS").

I think that it would be saner to not require RFC2822 compliance, but
accept the mail nonetheless.

The code in question:
https://github.com/corecode/dma/blob/master/mail.c#L376

—
Reply to this email directly or view it on GitHub
#18.

blueyed · 2014-01-11T22:15:56Z

The easiest approach would probably be to increase the buffer length, from 1000 bytes to maybe 64kB (65536).
It would/could then still fail in the same way, but it's much more unlikely.

Or, just do not add a newline (and effectively aborting / cutting the input) and keep reading in 1000 byte chunks.

corecode · 2014-01-12T10:17:20Z

The RFC is absolutely clear on this issue:

Each line of characters MUST be no more than 998 characters

and
However, there are so many implementations that (in compliance with
the transport requirements of [RFC5321
http://tools.ietf.org/search/rfc5321]) do not accept messages
containing more than 1000 characters including the CR and LF per line,
it is important for implementations not to create such messages.

It is clear that we can not forward any message with more than 1000
characters per line. What do you suggest as alternative? Insert a
linebreak after 998 characters?

On 01/11/2014 11:15 PM, Daniel Hahler wrote:

The easiest approach would probably be to increase the buffer length,
from 1000 bytes to maybe 64kB (65536).
It would/could then still fail in the same way, but it's much more
unlikely.

Or, just do not add a newline (and effectively aborting / cutting the
input) and keep reading in 1000 byte chunks.

—
Reply to this email directly or view it on GitHub
#18 (comment).

blueyed · 2014-01-12T17:01:30Z

Well, I think that dma is not creating the message, but only transports it.

The bug is really with the app that hands it over to dma.

Inserting newlines can be very problematic, if the message is meant to contain (encoded) data, and the result may fail to parse afterwards.

Imagine an automated message, with "key=value" pairs on separate lines, where it may result in "key=verylongvalue\r\nstillvalue".

However, breaking lines after 998 bytes is still far better than rejecting the mail altogether.

I can imagine having a configuration option (in dma.conf) for this: RFC2822_SPLITLINES.
If set, CRLF gets inserted after 998 bytes, otherwise the lines would get bypassed unchanged.

corecode · 2014-01-12T17:04:57Z

It is not optional to forward messages with more than 998 characters per
line: it is forbidden. Do you know how other MTAs handle this? I only
see two options: break line, or reject (bounce).

On 01/12/2014 06:01 PM, Daniel Hahler wrote:

Well, I think that dma is not creating the message, but only
transports it.

The bug is really with the app that hands it over to dma.

Inserting newlines can be very problematic, if the message is meant to
contain (encoded) data, and the result may fail to parse afterwards.

Imagine an automated message, with "key=value" pairs on separate
lines, where it may result in "key=verylongvalue\r\nstillvalue".

However, breaking lines after 998 bytes is still far better than
rejecting the mail altogether.

I can imagine having a configuration option (in dma.conf) for this:
RFC2822_SPLITLINES.
If set, CRLF gets inserted after 998 bytes, otherwise the lines would
get bypassed unchanged.

—
Reply to this email directly or view it on GitHub
#18 (comment).

blueyed · 2014-01-12T18:40:08Z

Then breaking the line is better than rejecting the mail.

Postfix has a line_length_limit option, with a default of 2048 (http://www.postfix.org/postconf.5.html):

Upon input, long lines are chopped up into pieces of at most this length; upon delivery, long lines are reconstructed.

It sounds like this will effectively pass through long lines?!

bcoca · 2014-03-26T19:24:41Z

@blueyed, IIRC that limit is for internal processing across Postfix apps and filters, for actual smtp this is the setting:
http://www.postfix.org/postconf.5.html#smtp_line_length_limit

corecode · 2014-03-26T19:26:30Z

Good catch. Any suggestions how we should handle this?

On 03/26/2014 08:24 PM, Brian Coca wrote:

@blueyed https://github.com/blueyed, IIRC that limit is for internal
processing across Postfix apps and filters, for actual smtp this is
the setting:
http://www.postfix.org/postconf.5.html#smtp_line_length_limit

—
Reply to this email directly or view it on GitHub
#18 (comment).

bcoca · 2014-03-26T19:47:08Z

I think rejection is OK, but maybe add more info to the message, 'bad mail input format' (I just ran into this) is a bit too succinct, knowing that its a line length error helps pinpoint the issue much faster. Also, quoting RFC helps convince programmers change their code vs demand that the MTA just accept their emails.

corecode · 2014-03-26T19:48:10Z

alternatively we could force a linebreak when accepting the mail, like
postfix does when sending?

On 03/26/2014 08:47 PM, Brian Coca wrote:

I think rejection is OK, but maybe add more info the the message, 'bad
mail input format' (I just ran into this) is a bit too succinct,
knowing that its a line length error helps pinpoint the issue much
faster. Also, quoting RFC helps convince programmers change their code
vs demand that the MTA just accept their emails.

—
Reply to this email directly or view it on GitHub
#18 (comment).

blueyed · 2014-03-27T07:18:27Z

The Postfix approach seems sensible.
To me everything is better than rejecting the mail.

quoting RFC helps convince programmers change their code

I agree. There could be a warning instead of the current error.

Given the use case of dma (for personal or lightweight use in containers) I do not think it should be too strict (i.e. rejecting the mail).

bcoca · 2014-03-27T14:52:20Z

I disagree, I use dma because it it simple and lightweight and doesn't try to do complex stuff like line splitting with character or armor encodings. If it accepts mail and it then gets rejected by another mailer, it creates more issues than it solves.

I am a fan of fail early and fail loudly, if you start accepting emails and push the problem downstream you never have an incentive to fix the actual issue, the program composing non compliant smtp messages.

Ref: corecode#18

Ref: #18

emaste · 2016-03-24T20:22:05Z

This is one issue that prevented the FreeBSD clusteradm team switching to DMA internally: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208261

bcoca · 2016-03-24T20:39:50Z

in postfix you CAN disable the validation, does not mean it doesn't do it, postfix has many options to 'tolerate' other broken implementations

corecode · 2016-03-26T20:06:08Z

@emaste what do you suggest dma should do? Just break lines?

emaste · 2016-06-17T16:06:53Z

@corecode I want to see the FreeBSD commit mail generator (I think it was the problem) fixed to not generate lines > 998 chars, and I think I agree that by default dma shouldn't accept such mail.

What do you think about having a config file option to set a (non-compliant) maximum line length, and/or one to enable line splitting?

corecode · 2016-06-17T16:19:57Z

I'm fine with not rejecting mails by default - I just don't know what we should do. Split the line? Cut it? Pass it and be non-compliant?

emaste · 2016-06-17T16:46:17Z

I don't like just cutting it (silently losing information in an otherwise passed email). Given the choice of passing non-compliant lines and splitting I'd choose the latter.

That leaves us with the question of doing it by default or only under a config file option. If I'm doing the work I wouldn't bother with the config option, but wouldn't object to one to appeal to the "fail early and fail loudly" by default mentioned by @bcoca.

corecode · 2016-06-17T17:12:45Z

I don't think we can easily add to the headers during reading. I'm fine with making the default to split lines. We'll need a strategy for splitting possibly overlong header lines tho, so that they stay being headers. Can we just split words in the middle, or do we have to split on word boundaries?

emaste · 2016-06-17T17:42:19Z

For me I consider the line splitting to be a "best effort" to deliver malformated mail, and I'd be happy enough splitting at whatever point the we reach the line length limit. I'd also be fine rejecting overlong headers and splitting only the body, if that's easier.

ttw · 2016-07-29T16:57:57Z

I believe the fundamental problem is context; currently there is none. Since 'dma' is, by design, a local MTA (i.e. non-listening) we should readily be able to distinguish from mail we are generating (i.e. out-going mail) and mail we are consuming (i.e. incoming for local delivery); since we are not relaying. I believe the existing behaviour is correct for out-going mail. For incoming mail we should be tolerant and pass the problem to the mail client. I do not believe that mangling email is a valid option as upstream mail signatures will then be incorrect.

To clarify, my own use case is using 'dma' as an out-going MTA and an incoming MDA (through fetchmail).

... had quick look at the code in relation to the above ... I think the initial proposal is correct in that we need to increase the 'readmail' buffer to a maximum (say 64k) and then add a check which will flag the mail as malformed if any line is over 1000 bytes (possibly through a generic 'int qitem.valid' which can map to various validity checks). We can then defer the actual decision until we actually 'deliver' the mail.

PS: you'll probably need a '#define RFC822_MAX_LINE 1000' and a '#define DMA_MAX_LINE' as is see those 'char[1000]' limits in a couple more places.
PPS: I can create a patch if the proposal is acceptable.

emaste · 2016-07-29T17:25:24Z

It's the outgoing mail side of this issue that affected the FreeBSD commit mail generator. It is true that the script should not generate long lines, but it's handled by the current MTA.

ttw · 2016-07-29T17:40:41Z

@emaste : I can only comment personally to that; firstly, I agree, "the script should not generate long lines". Secondly, we should not emit invalid output (i.e. lines over the limit). Thirdly, splitting content lines presumes a great deal about the content and, whilst it may not impact the FreeBSD output it may malform other content; reducing that problem is a sisyphean task.

In short (and IMHO), FreeBSD is correct it's assessment that 'dma' is an unsuitable mailer for its toolchain (because it produces illegal output).

ttw · 2016-07-30T03:41:06Z

Pull request is my best understanding of sane behaviour. It was actually a little simpler than I imagined since we queue the mail anyway; by defining different buffer lengths and only using the RFC length for remote delivery we can bump the error.

That said, upon reviewing the RFC I'm going to have to disagree with @corecode; the RFC does not prohibit lines longer than 1000 characters. IMHO it prohibits the production of messages which have lines longer than 1000 characters (e.g. the FreeBSD toolchain example we're discussing), however, it actually recommends that an MTA "handle an arbitrarily large number of characters". As such, there is probably a bit more work to do on the problem.

PS: I also forgot to patch the line termination so it actually has a valid CRLF, rather than an invalid LF terminator (semantic issue I know but I'll try and do that anyway).

For mails where the body does not respect RFC2822, split at the arbitrary length of 990 This should address corecode#18

For mails where the body does not respect RFC2822, try to split by words finding the last space before 1000's character If no spaces are found then consider the mail to be malformed anyway This should address corecode#18

Jamie-Landeg-Jones · 2022-02-01T18:14:01Z

I don't like the idea of message content being altered - at least, not without warning.

Of course, allowing lines over 1000 characters might cause a message to be corrupted somewhere else down the line.

Surely the clean solution would be to convert to quoted-printable? though this may be a pedantic overkill solution, especially for locally delivered mail.

corecode · 2022-02-01T18:27:02Z

That would change the whole whole content, I don't think we should take that liberty.

Jamie-Landeg-Jones · 2022-02-03T00:26:06Z

Fair enough.

I know I've not been involved with this discussion or development, so I'm not arrogant enough to expect my opinion to matter, but for what it's worth, if I was doing this, I'd be uncomfortable in breaking the spec (as you also are), but I'd also be uncomfortable altering the body in (what to those outside DMA) is an arbitrary manner.

Other applications that do that have caused unexpected (and sometimes unnoticed) errors, as exampled here:

https://stackoverflow.com/questions/9999908/php-mail-function-randomly-adds-a-space-to-message-text

https://stackoverflow.com/questions/10401863/how-to-send-a-csv-attachment-with-lines-longer-than-990-characters

Just my 1 cents worth

Cheers, Jamie

corecode · 2022-02-03T00:44:59Z

How is DMA breaking the spec?

emaste · 2022-02-03T01:31:20Z

@bapt added a patch to the FreeBSD base system for this, in freebsd/freebsd-src@b0b2d05 (and a bugfix from me in freebsd/freebsd-src@1a0dde3). This is slightly different from the proposal in #49, in that it does not attempt to split on whitespace. It does not do any splitting on headers.

IMO there are a few improvements that could be made:

split headers as well
split at whitespace if possible
make long line splitting a configurable option (reject mail with long lines, if not enabled)

corecode · 2022-02-03T01:54:15Z

@emaste I think an error in getline will get undetected and return success. Otherwise looks like an acceptable way to address such mails.

Jamie-Landeg-Jones · 2022-02-03T02:39:37Z

How is DMA breaking the spec?

It's not. There were suggestions earlier in this thread saying you should just ignore the 1000 character limit. You disagreed, and I agreed with you.

emaste · 2022-02-03T18:50:51Z

@corecode getline returns -1 on error or EOF, I think the feof/getline loop is non-canonical but I think checking getline's return for >0 is the appropriate condition.

I think this should go on top of @bapt's change:

diff --git a/contrib/dma/mail.c b/contrib/dma/mail.c
index 9e00c22d71db..0462ec1665ee 100644
--- a/contrib/dma/mail.c
+++ b/contrib/dma/mail.c
@@ -405,10 +405,7 @@ readmail(struct queue *queue, int nodot, int recp_from_header)
        if ((ssize_t)error < 0)
                return (-1);
 
-       while (!feof(stdin)) {
-               newline[0] = '\0';
-               if ((linelen = getline(&line, &linecap, stdin)) <= 0)
-                       break;
+       while ((linelen = getline(&line, &linecap, stdin)) > 0)
                if (had_last_line)
                        errlogx(EX_DATAERR, "bad mail input format:"
                                " from %s (uid %d) (envelope-from %s)",

emaste · 2022-02-03T19:05:12Z

@corecode ah, prior to bapt's change we had:

		if (fgets(line, sizeof(line) - 1, stdin) == NULL)
			break;

so same amount of error checking.

I think we can just check errno after the loop,

diff --git a/contrib/dma/mail.c b/contrib/dma/mail.c
index 0462ec1665ee..e052d745398c 100644
--- a/contrib/dma/mail.c
+++ b/contrib/dma/mail.c
@@ -507,8 +507,8 @@ readmail(struct queue *queue, int nodot, int recp_from_header)
                        }
                }
        }
-
-       ret = 0;
+       if (errno == 0)
+               ret = 0;
 fail:
        free(line);
        return (ret);

corecode · 2022-02-03T19:17:18Z

I think we can keep the feof() loop condition, but just do something about error return from getline.

emaste · 2022-02-03T19:36:18Z

I think it's fine to use getline()'s -1 return

I have a FreeBSD review in https://reviews.freebsd.org/D34159

ant5 · 2023-06-21T10:32:41Z

Hello!

After switching to dma for local delivery and few monthes of things going well I've got an error "bad mail input format" with one email message.

Does I understand right that there is a patch of cure and this patch is only in FreeBSD 13+ source tree?

How can I cure this?
Does I need to extract somehow this patch(es) from FreeBSD source and create custom port for mail/dma then add those patches to the custom port and then make custom package?

Can someone guide me to eliminate future accidents?

P.S. My system is FreeBSD 12-STABLE. Dma is mostly invoked by sieve scripts for coping/forwarding.

corecode · 2023-06-21T10:34:34Z

Are you submitting an incorrectly formatted message? The correct fix would be to only submit correctly formatted messages.

ant5 · 2023-06-21T10:52:14Z

One user (who is in company management) sended email to other users. Email message seems very casual.

But then he got email back:
I'm sorry to have to inform you that your message could not be delivered to one or more recipients. ... <[X@Z](mailto:X@Z)>: host 127.0.6.1[127.0.6.1] said: 451 4.2.0 <[X@Z](mailto:X@Z)> Execution of Sieve filters was aborted due to temporary failure (in reply to end of DATA command) ...
for those users who have configured copying/redirecting via sieve scripts.

The log say:
Jun 20 21:49:37 mail dma[e98a03][8169]: bad mail input format: from pop (uid 68) (envelope-from X@Z)

corecode · 2023-06-21T10:55:50Z

I would need to see the message to understand what happened. With the log messages it is not clear what happened. How are pop, sieve, dma, and the user related?

ant5 · 2023-06-21T11:05:26Z

Technicaly speaking, user work on workstation with some email client. Email client sent message to postfix via SMTP. Postfix relay message to dovecot via LMTP. Dovecote execute sieve script. Sieve script invoke mailwrapper (sendmail). Mailwrapper execute dma.

I have this message extracted from bounce email but I don't know is it in virgin form and will it raise an error.

corecode · 2023-06-21T11:12:32Z

That is a complicated setup. Why are you using dma and not postfix for dovecot for delivery? Without the original mail delivered to dma we don't know what is going on - it seems that it passed through several MTAs, so the assumption should be that the mail was formatted correctly. This pretty much rules out a 1000+ byte header line.

ant5 · 2023-06-21T11:15:45Z

Why are you using dma and not postfix for dovecot for delivery?

They all in separate jails. Dma in dovecot jail will relay to postfix in other jail (host).

ant5 · 2023-06-21T11:18:29Z

I'll try to invoke dma and feed it problematic message...
But it's not so simple for me.

ant5 · 2023-06-21T11:43:25Z

Yes. The message gives an error.

Zipped and attached:
< deleted confidentional data >

corecode · 2023-06-21T16:47:09Z

Thank you. The problem is that one line is exactly 1000 bytes long including the \r\n.

haarp · 2024-01-04T16:19:29Z

Wouldn't a config option setting the max buffer size/line length be the easiest solution here? Yes, it would break the spec, but it would be up to the admin to decide that. And for local-only delivery, there are no other servers further down the line that might choke on it.

Splitting is also a decent solution, but not always desireable. Some content, like code or logs shouldn't be split, to retain proper syntax.

Relegating this to the mail producer is also not always an option. They can be unpredictable. Maybe there was an error during a cronjob and it prints an unpredictably long line. The admin would get no mail, and very likely not realize.

corecode · 2024-01-06T02:01:29Z

we definitely don't want to break the spec. if there is a line >1000 chars, we either have to split it or reject it outright. we can't pass on an invalid message.

kgcgs · 2024-09-11T19:51:58Z

The only correct thing to do here is mimic the behavior of the fully featured MTAs. Sendmail has always split long lines in order to enforce compliance in messages that it creates but happily passes messages with long lines through SMTP without molesting them. Failure to address this in dma is just going to lead to messages being lost unexpectedly which is how most of us probably got here in the first place. Seems like an easy fix? (Even if it's a part of the RFC that's likely completely unnecessary to respect.)

blueyed mentioned this issue Jan 11, 2014

"bad mail input format" with dma / error not raised perenecabuto/django-sendmail-backend#1

Open

blueyed added a commit to blueyed/dma that referenced this issue Sep 22, 2014

Add more information to "bad mail input format" error

3e176d9

Ref: corecode#18

blueyed mentioned this issue Sep 22, 2014

Add more information to "bad mail input format" error #24

Closed

corecode pushed a commit that referenced this issue Sep 22, 2014

Add more information to "bad mail input format" error

b1056e4

Ref: #18

ttw mentioned this issue Jul 30, 2016

Extend 1000 character limit for mail receipt #42

Closed

ttw mentioned this issue Jul 30, 2016

'readmail' perverts the final terminator #43

Open

bapt added a commit to bapt/dma that referenced this issue Oct 21, 2016

Split body of mails not respecting RFC2822

e55c867

For mails where the body does not respect RFC2822, split at the arbitrary length of 990 This should address corecode#18

bapt added a commit to bapt/dma that referenced this issue Oct 21, 2016

Split body of mails not respecting RFC2822

a4a30d7

For mails where the body does not respect RFC2822, split at the arbitrary length of 990 This should address corecode#18

Amooti73 mentioned this issue Nov 6, 2021

Permission denied message in Debian 11 leahneukirchen/mblaze#216

Closed

mslehto mentioned this issue Sep 21, 2022

Error "remote delivery failed: corrupted queue file" with RT submission #110

Closed

haarp mentioned this issue Jan 3, 2024

cron fails to send email #88

Closed

Handle lines longer than 1000 bytes: error "bad mail input format" #18

Handle lines longer than 1000 bytes: error "bad mail input format" #18

Comments

blueyed commented Jan 11, 2014

corecode commented Jan 11, 2014

blueyed commented Jan 11, 2014

corecode commented Jan 12, 2014

blueyed commented Jan 12, 2014

corecode commented Jan 12, 2014

blueyed commented Jan 12, 2014

bcoca commented Mar 26, 2014

corecode commented Mar 26, 2014

bcoca commented Mar 26, 2014

corecode commented Mar 26, 2014

blueyed commented Mar 27, 2014

bcoca commented Mar 27, 2014

emaste commented Mar 24, 2016

bcoca commented Mar 24, 2016

corecode commented Mar 26, 2016

emaste commented Jun 17, 2016 • edited Loading

corecode commented Jun 17, 2016

emaste commented Jun 17, 2016

corecode commented Jun 17, 2016

emaste commented Jun 17, 2016

ttw commented Jul 29, 2016 • edited Loading

emaste commented Jul 29, 2016

ttw commented Jul 29, 2016 • edited Loading

ttw commented Jul 30, 2016

Jamie-Landeg-Jones commented Feb 1, 2022 • edited Loading

corecode commented Feb 1, 2022

Jamie-Landeg-Jones commented Feb 3, 2022 • edited Loading

corecode commented Feb 3, 2022

emaste commented Feb 3, 2022

corecode commented Feb 3, 2022

Jamie-Landeg-Jones commented Feb 3, 2022

emaste commented Feb 3, 2022

emaste commented Feb 3, 2022

corecode commented Feb 3, 2022

emaste commented Feb 3, 2022

ant5 commented Jun 21, 2023

corecode commented Jun 21, 2023

ant5 commented Jun 21, 2023

corecode commented Jun 21, 2023

ant5 commented Jun 21, 2023

corecode commented Jun 21, 2023

ant5 commented Jun 21, 2023

ant5 commented Jun 21, 2023

ant5 commented Jun 21, 2023 • edited Loading

corecode commented Jun 21, 2023

haarp commented Jan 4, 2024

corecode commented Jan 6, 2024

kgcgs commented Sep 11, 2024

emaste commented Jun 17, 2016 •

edited

Loading

ttw commented Jul 29, 2016 •

edited

Loading

ttw commented Jul 29, 2016 •

edited

Loading

Jamie-Landeg-Jones commented Feb 1, 2022 •

edited

Loading

Jamie-Landeg-Jones commented Feb 3, 2022 •

edited

Loading

ant5 commented Jun 21, 2023 •

edited

Loading