File name with brackets in it

Jun 5, 2009 at 2:21 PM

The ImapMessageBodyPart constructor raises an "Invalid format could not parse body part headers." error when the attached file name has brackets "()" in it. The body structure that triggers it is below.

("application" "pdf" ("name" "This is testing with (word).pdf") NIL "This is testing with (word).pdf" "base64" 433238 NIL ("attachment" ("filename" "This is testing with (word).pdf" "size" "316659" "creation-date" "Fri, 05 Jun 2009 19:26:46 GMT" "modification-date" "Fri, 05 Jun 2009 19:26:46 GMT")) NIL NIL)

It looks like the "attachment" regex needs some modifications.

Can you please help me with this?

Thanks
Ajay Sawant

Jun 16, 2009 at 9:23 AM

I have encountered the same problem.

For now, I am using this regex:

^\\((?<type>(\"[^\"]*\"|NIL))\\s(?<subtype>(\"[^\"]*\"|NIL))\\s(?<attr>(?>\\((?<LEVEL>)|\\)(?<-LEVEL>)|(?!\\(|\\)).)+(?(LEVEL)(?!))|NIL)\\s(?<id>(\"[^\"]*\"|NIL))\\s(?<desc>(\"[^\"]*\"|NIL))\\s(?<encoding>(\"[^\"]*\"|NIL))\\s(?<size>(\\d+|NIL))\\s((?<data>(.*))\\s|)(?<lines>(\"[^\"]*\"|NIL))\\s(?<disposition>((?>\\((?<LEVEL>)|\\)(?<-LEVEL>)|(?!\\(|\\)).)+(?(LEVEL)(?!))|NIL))\\s(?<lang>(\"[^\"]*\"|NIL))\\)$
But it works only if the file name contains both an opening and a closing bracket (e.g. "This is testing with (word.pdf" does not match).
The same problem is present in the attachment section (I copied the regex rule from there).
Is there any regex guru who can lead us the right way? :P
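
The limitation with a lone "(" follows from how the balancing group counts: it treats every parenthesis the same, whether or not it sits inside a quoted string. One possible refinement, which is a sketch and not something posted in this thread, is to consume each quoted string as a single unit before looking at parentheses. The class name and the `QuoteAware` pattern are hypothetical, and the sketch assumes quoted strings never contain backslash-escaped quotes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteAwareBalanceDemo
{
    // Balancing group as used in the thread: every "(" pushes onto LEVEL,
    // every ")" pops, and (?(LEVEL)(?!)) fails if anything is left over.
    public const string Balanced =
        @"^((?>\((?<LEVEL>)|\)(?<-LEVEL>)|(?!\(|\)).)+(?(LEVEL)(?!))|NIL)$";

    // Hypothetical variant (not from the thread): a quoted string is matched
    // first as one unit, so brackets inside it never touch the LEVEL stack.
    public const string QuoteAware =
        @"^((?>""[^""]*""|\((?<LEVEL>)|\)(?<-LEVEL>)|[^()""])+(?(LEVEL)(?!))|NIL)$";

    public static void Main()
    {
        string[] samples =
        {
            "(\"name\" \"This is testing with (word).pdf\")",
            "(\"name\" \"This is testing with (word.pdf\")",
        };

        foreach (string s in samples)
        {
            Console.WriteLine(s);
            Console.WriteLine("  balanced:    " + Regex.IsMatch(s, Balanced));
            Console.WriteLine("  quote-aware: " + Regex.IsMatch(s, QuoteAware));
        }
    }
}
```

With these two inputs the plain balancing group rejects the second sample, while the quote-aware variant accepts both.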
Jun 22, 2009 at 6:48 PM
Edited Jun 22, 2009 at 6:49 PM

After some regex studying:

string attachment = "^\\((?<type>(\"[^\"]*\"|NIL))\\s(?<subtype>(\"[^\"]*\"|NIL))\\s(?<attr>(\\(.*?\\)|NIL))\\s(?<id>(\"[^\"]*\"|NIL))\\s(?<desc>(\"[^\"]*\"|NIL))\\s(?<encoding>(\"[^\"]*\"|NIL))\\s(?<size>(\\d+|NIL))\\s((?<data>(.*))\\s|)(?<lines>(\"[^\"]*\"|NIL))\\s(?<disposition>(\\(.*?\\)|NIL))\\s(?<lang>(\"[^\"]*\"|NIL))\\)$";

This time it works in any case.

Sep 24, 2009 at 9:35 AM
Edited Sep 24, 2009 at 9:36 AM

This should be better, because your expression does not match this: "(\"text\" \"plain\" (\"charset\" \"koi8-r\") NIL NIL \"quoted-printable\" 3511 93 NIL NIL \"en-US\" NIL)";

string attachment = "^\\((?<type>(\"[^\"]*\"|NIL))\\s(?<subtype>(\"[^\"]*\"|NIL))\\s(?<attr>(\\(.*?\\)|NIL))\\s(?<id>(\"[^\"]*\"|NIL))\\s(?<desc>(\"[^\"]*\"|NIL))\\s(?<encoding>(\"[^\"]*\"|NIL))\\s(?<size>(\\d+|NIL))\\s((?<data>(.*))\\s|)(?<lines>(\"[^\"]*\"|NIL))\\s(?<disposition>((?>\\((?<LEVEL>)|\\)(?<-LEVEL>)|(?!\\(|\\)).)+(?(LEVEL)(?!))|NIL))\\s(?<lang>(\"[^\"]*\"|NIL))\\)$";
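
To compare the two posted patterns without reading them character by character, a small console harness can run both against the two body structures quoted in this thread. This is only a sketch: the class name, the `Head`/`Tail` split, and the `Report` helper are illustrative, and the patterns are copied from the posts in this thread, not from the library source.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BodyPartRegexCheck
{
    // Part shared by both posted patterns, up to the <disposition> group.
    const string Head =
        "^\\((?<type>(\"[^\"]*\"|NIL))\\s(?<subtype>(\"[^\"]*\"|NIL))\\s" +
        "(?<attr>(\\(.*?\\)|NIL))\\s(?<id>(\"[^\"]*\"|NIL))\\s" +
        "(?<desc>(\"[^\"]*\"|NIL))\\s(?<encoding>(\"[^\"]*\"|NIL))\\s" +
        "(?<size>(\\d+|NIL))\\s((?<data>(.*))\\s|)(?<lines>(\"[^\"]*\"|NIL))\\s";
    const string Tail = "\\s(?<lang>(\"[^\"]*\"|NIL))\\)$";

    // Jun 22 pattern: lazy <disposition>.
    public const string LazyDisposition =
        Head + "(?<disposition>(\\(.*?\\)|NIL))" + Tail;

    // Sep 24 pattern: balancing-group <disposition>.
    public const string BalancedDisposition =
        Head + "(?<disposition>((?>\\((?<LEVEL>)|\\)(?<-LEVEL>)|(?!\\(|\\)).)+(?(LEVEL)(?!))|NIL))" + Tail;

    // Body structure from the first post (file name with brackets).
    public const string PdfSample =
        "(\"application\" \"pdf\" (\"name\" \"This is testing with (word).pdf\") NIL " +
        "\"This is testing with (word).pdf\" \"base64\" 433238 NIL " +
        "(\"attachment\" (\"filename\" \"This is testing with (word).pdf\" " +
        "\"size\" \"316659\" \"creation-date\" \"Fri, 05 Jun 2009 19:26:46 GMT\" " +
        "\"modification-date\" \"Fri, 05 Jun 2009 19:26:46 GMT\")) NIL NIL)";

    // Body structure from the Sep 24 post.
    public const string TextSample =
        "(\"text\" \"plain\" (\"charset\" \"koi8-r\") NIL NIL " +
        "\"quoted-printable\" 3511 93 NIL NIL \"en-US\" NIL)";

    // Prints whether the pattern matches and what <disposition> captured.
    static void Report(string label, string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        string captured = m.Success ? m.Groups["disposition"].Value : "(none)";
        Console.WriteLine(label + ": match=" + m.Success + ", disposition=" + captured);
    }

    public static void Main()
    {
        Report("lazy / pdf", LazyDisposition, PdfSample);
        Report("lazy / text", LazyDisposition, TextSample);
        Report("balanced / pdf", BalancedDisposition, PdfSample);
        Report("balanced / text", BalancedDisposition, TextSample);
    }
}
```

By my reading, the balancing pattern matches both samples and the lazy pattern matches neither. It is still worth looking at what <disposition> captures in each case, because the balancing group accepts any run of text with balanced brackets and can therefore absorb a neighbouring field.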

Oct 28, 2009 at 6:15 PM

I am having this same issue. Is there any consensus on what the updated regex should be? I don't have the time to try to comprehend the regexes from the last two posts. Could someone give a breakdown of the differences?

@mavex83 said that the following worked in any case:

^\\((?<type>(\"[^\"]*\"|NIL))\\s(?<subtype>(\"[^\"]*\"|NIL))\\s(?<attr>(\\(.*?\\)|NIL))\\s(?<id>(\"[^\"]*\"|NIL))\\s(?<desc>(\"[^\"]*\"|NIL))\\s(?<encoding>(\"[^\"]*\"|NIL))\\s(?<size>(\\d+|NIL))\\s((?<data>(.*))\\s|)(?<lines>(\"[^\"]*\"|NIL))\\s(?<disposition>(\\(.*?\\)|NIL))\\s(?<lang>(\"[^\"]*\"|NIL))\\)$

But then @mr_squall gave the following, saying it would work better:

^\\((?<type>(\"[^\"]*\"|NIL))\\s(?<subtype>(\"[^\"]*\"|NIL))\\s(?<attr>(\\(.*?\\)|NIL))\\s(?<id>(\"[^\"]*\"|NIL))\\s(?<desc>(\"[^\"]*\"|NIL))\\s(?<encoding>(\"[^\"]*\"|NIL))\\s(?<size>(\\d+|NIL))\\s((?<data>(.*))\\s|)(?<lines>(\"[^\"]*\"|NIL))\\s(?<disposition>((?>\\((?<LEVEL>)|\\)(?<-LEVEL>)|(?!\\(|\\)).)+(?(LEVEL)(?!))|NIL))\\s(?<lang>(\"[^\"]*\"|NIL))\\)$

What's the difference?

Thanks!

Oct 28, 2009 at 6:17 PM

One further point: I downloaded the most recent version of the source, and the regex has not been updated to handle file names with parentheses. How can it be that there haven't been enough reports of this to warrant an update in the code?

Oct 28, 2009 at 7:48 PM

Well, I had no choice but to look more thoroughly through the two variations, and I seem to have answered my own question.

In the first one, the named capturing group <disposition> uses the same simple pattern as the <attr> group: (\(.*?\)|NIL)

But if you look at the regex in the original codebase, the <disposition> group is much more detailed than this. The second regex, posted by @mr_squall, simply keeps those details: ((?>\((?<LEVEL>)|\)(?<-LEVEL>)|(?!\(|\)).)+(?(LEVEL)(?!))|NIL)

Hope this helps. Now, can we get the source changed?!
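
The difference is easier to see when the two <disposition> sub-patterns are run on their own. The balancing version pushes onto LEVEL for each "(" and pops for each ")", so it only stops once the brackets are balanced; the lazy version stops at the first ")" it can. The sketch below is illustrative only: the class and helper names are made up, and the trailing `$` is left off so the output shows how far each sub-pattern gets by itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DispositionGroupDemo
{
    // <disposition> body from the Jun 22 pattern.
    public const string Lazy = @"^(\(.*?\)|NIL)";

    // <disposition> body from the Sep 24 pattern (.NET balancing group).
    public const string Balanced =
        @"^((?>\((?<LEVEL>)|\)(?<-LEVEL>)|(?!\(|\)).)+(?(LEVEL)(?!))|NIL)";

    // Returns the text the pattern consumed from the start of the input.
    public static string Prefix(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "<no match>";
    }

    public static void Main()
    {
        string nested = "(\"attachment\" (\"filename\" \"a.pdf\"))";
        string quoted = "\"en-US\"";

        Console.WriteLine(Prefix(Lazy, nested));     // ("attachment" ("filename" "a.pdf")
        Console.WriteLine(Prefix(Balanced, nested)); // ("attachment" ("filename" "a.pdf"))
        Console.WriteLine(Prefix(Lazy, quoted));     // <no match>
        Console.WriteLine(Prefix(Balanced, quoted)); // "en-US"
    }
}
```

The last line also shows why the second regex accepts the koi8-r sample: the balancing alternative matches any text without unbalanced brackets, including a plain quoted string.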