Wednesday, October 16, 2013

Email::Valid Peculiarities

Over here at Dada Mail, we do a lot of verification of email addresses when confirming subscriptions. Is the form of the email correct? Has this address been used to try to subscribe with before? Has the user themselves confirmed their subscription? What happens when there's an error? There's a lot to it (sadly) and if something in one of these steps goes wrong, a subscription doesn't happen, or worse yet, a mailing list is tarnished with not-so-valid addresses. This little tale is about addresses that are invalid in form getting onto people's mailing lists.

A user posted on the Dada Mail support forums: "There's an address with a space in it, that's entered my list! It's causing problems with delivery of the ENTIRE list? How did that happen?"

A lot of the time developing a very complicated piece of software is supporting it (I think, anyways). When a question like this comes down the tubes, I have no great answer, except it must have been an outside thing. So I ask, "Was the address added directly into the backend?" That's something that happens from time to time, although there are many pitfalls: importing data can be done incorrectly - and even if it's done correctly once, the design of the backend can change, and then it's not done correctly anymore.

That's why, if it's an option, it's a good idea to use the API that's presented to the developer, rather than try to retinker it yourself.

When the user replied back with, "Nope!" I knew I was really stuck.  "But I have tests!"  I protested to my Macbook Pro. "Coverage!" "I didn't roll my own email validator!" "I did everything you're supposed to do!"

 So, current tests be-damned, I had to start from square one: Attempt to recreate the problem, in the simplest way I could. So, I tried to join a list, with an address with a space in it, like this:

myaddress@example .com

And, wouldn't you know it, the app took the address like nothing was wrong. Now,  totally believing my user, but being in total disbelief myself, it was time to hit the code.  We use Email::Valid in Dada Mail, to test the validity of an email address, like so:



#!/usr/bin/perl 

use Email::Valid; 

my $address = 'someone@example.com'; 

if(Email::Valid->address($address)){ 
 print "Valid!"; 
}
else { 
 print "NOPE!"; 
} 
# prints, Valid! 


This usage is wrong, though. Observe:


#!/usr/bin/perl 

use Email::Valid; 

my $address = 'someone@example .com'; # notice the space! 

if(Email::Valid->address($address)){ 
 print "Valid!"; 
}
else { 
 print "NOPE!"; 
}
# Prints, Valid!

And, there's my problem, in a pretty simple case.

So, what's going on? Basically my code had assumed that Email::Valid's address method returned a boolean value. It does not. From Email::Valid's docs (emphasis is mine):

"This is the primary method which determines whether an email address is valid. It's behavior is modified by the values of mxcheck(), tldcheck(), local_rules(), fqdn(), and fudge(). If the address passes all checks, the (possibly modified) address is returned as a string. Otherwise, the undefined value is returned. In a list context, the method also returns an instance of the Mail::Address class representing the email address."

Big difference! So let's try that code snippet, again:

 
#!/usr/bin/perl 

use Email::Valid; 

my $address = 'someone@example .com'; # notice the space! 

my $validated_address = undef; 
if($validated_address = Email::Valid->address($address)){ 
 print "Valid!: $validated_address"; 
}
else { 
 print "NOPE!"; 
}
# prints, Valid!: someone@example.com

Ah-ha: Email::Valid will take it upon itself to modify the email address for you - but only in certain circumstances - spaces in certain places of the address. So, this still returns undef:


#!/usr/bin/perl 

use Email::Valid; 

my $address = 'some one@example.com'; # notice the space! 

my $validated_address = undef; 
if($validated_address = Email::Valid->address($address)){ 
 print "Valid!: $validated_address"; 
}
else { 
 print "NOPE!"; 
}
# prints, NOPE!

This certainly makes things confusing. The reason that the email address gets modified is that, under the hood, Email::Valid turns the address you give it into a Mail::Address object (it'll also just accept a Mail::Address object), and Mail::Address itself will take out that pesky space in the domain part of our email address when its own address() method returns a string:


#!/usr/bin/perl 

use Mail::Address;

my    $address  = 'someone@example .com'; # notice the space!
my    $modified = ( Mail::Address->parse($address) )[0]->address;
print $modified;

# prints,  someone@example.com

This makes it difficult to patch up Email::Valid, since then I'd have to delve into Mail::Address - and perhaps Mail::Address has a perfectly good reason to do this. Since both Email::Valid and Mail::Address have been around for so long, and many programs have been written with their API the way it is, it's a pretty good bet they'll stay this way.

So, what to do? Since my app's validation system is a little more than, "is the form correct?" it's difficult in the workflow to take into account the changes Email::Valid makes to the address I pass - I'm not sure if I even like the idea: getting a boolean back from a validation method seems a better idea - most of my own validation methods in the app have the pattern of, return a boolean status, plus a hashref saying what may be wrong:

 
# In a perfect world: 
my ($status, $errors) = $validater->check_out_this_address($address); 
if($status == 1){ 
    # Good to go!
}
else { 
    print "Dude, PROBLEMS:\n"; 
    for (keys %$errors){ 
        print '* ' . $_ . "\n"; 
    }
}
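As a rough sketch of what a method following that pattern could look like: the name check_address_form and the error keys below are made up for illustration, and the checks are deliberately crude (they're nowhere near a replacement for Email::Valid). The point is only the shape of the return value, a status plus a hashref of what went wrong.

```perl
use strict;
use warnings;

# Hypothetical form check. Returns (1, {}) when nothing is wrong,
# or (0, { some_problem => 1, ... }) when something is.
sub check_address_form {
    my ($address) = @_;
    $address = '' unless defined $address;

    my %errors;
    $errors{is_blank}       = 1 if $address eq '';
    $errors{has_whitespace} = 1 if $address =~ /\s/;
    $errors{missing_at}     = 1 if $address ne '' && $address !~ /@/;

    my $status = %errors ? 0 : 1;
    return ( $status, \%errors );
}

my ( $status, $errors ) = check_address_form('someone@example .com');
if ( $status == 1 ) {
    print "Good to go!\n";
}
else {
    print "Dude, PROBLEMS:\n";
    print "* $_\n" for sort keys %$errors;
}
```

The caller never has to care whether the address was rewritten along the way: it gets a plain yes/no, and the reasons are there if it wants them.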


So, for example, say I'm importing a whole address book of addresses - thousands of addresses! And before I do that, I sort out the duplicate addresses, as the validation steps are many, and CPU-consuming. Using Email::Valid the correct way could introduce subtle bugs into the system. Simplifying things:

 
#!/usr/bin/perl 

use Email::Valid;

# Say, I've already sorted out dupes: 
my @check_these = 
(
'one@foobar.com',
'another@foobar.com',
'another@foobar .com',
'two@foobar.com',
'blech'
); 

my @validated = (); 
my @invalid   = (); 

for(@check_these){ 
    my $address; 
    if($address = Email::Valid->address($_)) { 
        push(@validated, $address); 
    }
    else { 
        push(@invalid, $_); 
    } 
}

print "Good! Addresses:\n";
print "* $_\n" for @validated;

print "BAD! Addresses:\n";
print "* $_\n" for @invalid;

This prints out:
 
Good! Addresses:
* one@foobar.com
* another@foobar.com
* another@foobar.com
* two@foobar.com
BAD! Addresses:
* blech


As you can see, some weird things happen:
  • The total number of addresses that are returned isn't the same as what gets passed
  • Not all of the invalid addresses are reported
  • My once unique list now isn't so unique anymore.
 
I'm not sure yet what the best thing to do is. For now, sadly, I have to do something hacky in my own email (in form) validator - I just look for a space, if there's a space, it's automatically invalid:



#!/usr/bin/perl 

use Email::Valid; 

my $address = 'someone@example. com'; # notice the space! 

if(is_valid_email($address)){ 
 print "Valid!"; 
}
else { 
 print "Nope!"; 
}

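sub is_valid_email { 
 my $email = shift;   # not $a, which Perl reserves for sort()
 
 # Any whitespace at all means the address is rejected outright
 if($email =~ m/\s/){ 
  return 0; 
 } 
 # address() returns the (possibly modified) address or undef; boil that down to 1 or 0
 if(Email::Valid->address($email)){ 
  return 1; 
 }
 else { 
  return 0; 
 }
}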

# prints, Nope!
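
A different take on the same guard, sketched here as an assumption rather than what Dada Mail actually does: instead of looking for whitespace, compare what the validator hands back with what was passed in, and treat any rewritten address as invalid. The sub name partition_addresses and the optional $check coderef are made up for this example; $check is only a hook so a different validator can be swapped in, and by default it calls Email::Valid->address.

```perl
use strict;
use warnings;

# Sort addresses into valid/invalid. An address the validator rewrote
# counts as invalid, so every input lands in exactly one pile, unchanged.
sub partition_addresses {
    my ( $addresses, $check ) = @_;

    # Hypothetical hook: defaults to Email::Valid, loaded only when needed
    $check ||= sub {
        require Email::Valid;
        my $address = Email::Valid->address( $_[0] );
        return $address;
    };

    my ( @valid, @invalid );
    for my $input (@$addresses) {
        my $result = $check->($input);
        if ( defined $result && $result eq $input ) {
            push @valid, $input;
        }
        else {
            push @invalid, $input;
        }
    }
    return ( \@valid, \@invalid );
}
```

Called as my ($valid, $invalid) = partition_addresses(\@check_these); with the list from the import example, and assuming Email::Valid behaves as described above, 'another@foobar .com' should end up next to 'blech' in the invalid pile. The number of addresses coming out matches the number going in, and a list that was unique going in stays unique.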

So many miscalculations on my part, which means much to learn from my own dumb self:

Firstly, I couldn't believe it was a bug, since the way things have worked hadn't been changed in eons.

Next: way back when I first put this validation stuff in the app, years ago, I was thinking that Email::Valid was nothing but a nice wrapper around a very, very complex regex created by Jeffrey Friedl back in the day for a book on Regular Expressions that he wrote. It looks a little like this. It's a doozy.

Another miscalculation I had made was, "I have tests! How could this fail!" and I made the assumption the user was in fact in the wrong. So, I looked at my tests, and whatayaknow:
 

TODO: {
    local $TODO = 'This test fails.';
    ok(valid_email_sub('test@example .com')           == 0); 
};

So I actually knew about this problem before, and forgot - there's actually a link to a bug tracker in the actual test, which means I had handed off the problem to someone else to fix. Lazy!

Things to learn: 

  • Read the docs! This very peculiarity is documented in Email::Valid's own docs. Don't assume!

Let's see an example of how the address may be modified:
$addr = Email::Valid->address('Alfred Neuman <Neuman @ foo.bar>');
print "$addr\n"; # prints Neuman@foo.bar

  • Just because you think everything's working, doesn't mean there aren't weird little edge cases -  it's probably a given.
  • Tests are good! Skipping tests that don't pass doesn't always help you. 
  • Reporting bugs in other modules is good! Trying to fix them? Probably better. 

My thanks to rjbs for being patient as I stumble around a module he has been nice enough to adopt, warts and all.
 

Tuesday, October 1, 2013

Dada Mail v6.7.0 Released! Save as Draft, Enhanced Attachments, jQuery Plugin

Dada Mail, v6.7.0 has been released! Hurrah! You can read the official announcement here, view the changelog, as well as download/install it yourself. If you're looking for a simple to use mailing list manager (written in Perl), give it a go! It's one of the few Perl apps that's available as a one-click install in many services that you'll see in hosting control panels, like cPanel - safe to say, it's installed a ton!

Feel free to explore away down Dada Mail Lane - but that's not really what I wanted to focus on in this blog post. What I really wanted to focus on is how Perl and CPAN have helped me ship the new features of this app in a matter of weeks. So let's do it!

Feature! Save as Draft


The Save as Draft feature in Dada Mail lets you start writing a message, then save it as you go, so you don't have to finish creating a message in one sitting. It's very similar to the auto-save feature in the blog software I'm using right now to author this post!

When I create a new feature, there's a few things I like to make sure I'm doing right: I like to make things as simple as possible, but have room for future growth. I can't predict the future, but I still can try to think of features that I would like to work on, in the future.

For Save as Draft, the backend is super simple: the SQL table (Dada Mail supports MySQL, PostgreSQL and SQLite) looks a little bit like this:

 
CREATE TABLE IF NOT EXISTS dada_message_drafts (
id INT4 NOT NULL PRIMARY KEY AUTO_INCREMENT,
list varchar(16),
created_timestamp TIMESTAMP DEFAULT NOW(),
last_modified_timestamp TIMESTAMP,
name varchar(80), 
screen varchar(80),
role varchar(80),
draft mediumtext
);

id is your primary key.

list is used to distinguish what list the draft messages belong to.

created_timestamp and last_modified_timestamp give some file status-type information on your draft, since you can think of a draft as a sort of document. It's nice for example, to be able to sort your drafts by their last_modified_timestamp, so we can always load up the most recent draft, when someone visits the screen that we use to edit a message.

Speaking of the two timestamp cols in the same table, there seem to be many different opinions on how exactly to use them together, as some databases don't allow you to have more than one column with a default timestamp value. Again, for simplicity's sake, I decided to make INSERTs set the created_timestamp automatically, and set the last_modified_timestamp explicitly:

INSERT INTO dada_message_drafts (list, screen, role, draft, last_modified_timestamp) 
VALUES (?,?,?,?, NOW());

and then UPDATEs would again just explicitly update the last_modified_timestamp:

UPDATE dada_message_drafts SET screen = ?, role = ?, draft = ?, last_modified_timestamp = NOW() WHERE list = ? AND id = ?;

Again: simple, simple, simple.

A bigger problem I faced was basically how to save what is going to be an ever-changing set of form fields. I didn't want the Save as Draft feature to dictate what I'd be presenting the user in terms of widgets and things that they can utilize.  

Maybe one day, I want to set up how the Partial List Sending feature works differently in the next version, but not have to worry about having a column in this table for drafts now useless, or worse: shoehorned for a feature it wasn't meant to be -

 I didn't want this:

 
CREATE TABLE IF NOT EXISTS dada_message_drafts (
id INT4 NOT NULL PRIMARY KEY AUTO_INCREMENT,
list varchar(16),
created_timestamp TIMESTAMP DEFAULT NOW(),
last_modified_timestamp TIMESTAMP,
name varchar(80), 
screen varchar(80),
role varchar(80),
header_subject,
custom_header_1,
message_plaintext,
message_html mediumtext, 
attachment_1 mediumtext, 
attachment_2 mediumtext, 
attachment_3 mediumtext, 
custom_widget_value_1 mediumtext
);

Blech. Rather, it made more sense to encode all the form values together, and have an easy way to decode them, which runs right into the next problem:

Now that I have the values to fill in my form - well, how do I do that easily - and more importantly, with the flexibility I want? It's a big-time drain problem - sure: filling out a text field is easy, but how about a check box array, or a popup form? Forget it!

So, of course there's a solution in Perl/CPAN! And the one I decided upon was HTML::FillInForm::Lite. Lite, because this app is shipped with any required modules (the app is easy to install for people who know nothing of cpan/cpanm/perlbrew/so on and so forth) and that also means no XS modules, outside the core.

HTML::FillInForm::Lite requires a datasource to use to fill in the HTML form fields (that you also supply) and one of the data sources it supports is an object with a param() method, like CGI.pm (yes! I still use CGI.pm - Dada Mail requires only Perl v5.8.1 and a CGI env. to work!) CGI.pm itself has a way to "save the state of the form", which sounds goofy, but is exactly what we need to do! The examples in its own docs only show how to write to a filehandle, and it's a little easier to pass a scalar to a database handle, so you can do something like this very Perlish trick, to get the data into a string form:

 
# Assumes "use Carp qw(croak);" near the top of the module
sub stringify_cgi_params {

    my $self = shift;
    my ($args) = @_;

    if ( !exists( $args->{-cgi_obj} ) ) {
        croak "You MUST pass a, '-cgi_obj' parameter!";
    }

    # Work on the CGI object that remove_unwanted_params() hands back
    my $q = $self->remove_unwanted_params(
        {
            -cgi_obj => $args->{-cgi_obj},
        }
    );

    # Open a filehandle that writes straight into $buffer
    my $buffer = "";
    open my $fh, ">", \$buffer or die 'blarg!' . $!;
    $q->save($fh);
    close $fh;
    return $buffer;
}



The old, "open filehandles directly to Perl scalars" trick (open my $fh, ">", \$buffer or die 'blarg!' . $!;)
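
Stripped of the CGI.pm parts, the trick can be sketched on its own like this. The sub names and the sample lines are made up for the example; the only thing being shown is that a reference to a scalar can stand in for a filename in open(), for writing as well as reading.

```perl
use strict;
use warnings;

# Write lines into a plain scalar through an in-memory filehandle.
sub lines_to_string {
    my (@lines) = @_;
    my $buffer = '';
    open my $out, '>', \$buffer or die "Can't open in-memory handle: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Can't close in-memory handle: $!";
    return $buffer;
}

# Read them back out of the scalar the same way.
sub string_to_lines {
    my ($string) = @_;
    open my $in, '<', \$string or die "Can't open in-memory handle: $!";
    chomp( my @lines = <$in> );
    close $in;
    return @lines;
}

my $saved = lines_to_string( 'list=somelist', 'subject=Hello' );
print "$_\n" for string_to_lines($saved);
```

Anything that only knows how to print to a filehandle can be pointed at a string this way, which is handy when the string is headed for a database column rather than a file on disk.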

Once we want to repopulate the form, it's easy enough to decode those form fields, again, like so:


sub decode_draft {
    my $self  = shift;
    my $saved = shift;
    # "or", not "||": with "||" the die would bind to \$saved and never fire
    open my $fh, '<', \$saved or die $!;
    require CGI;
    my $q = CGI->new($fh);
    return $q;
}


$q is now a CGI object, that we can pass right to HTML::FillInForm::Lite:

require HTML::FillInForm::Lite;
my $h = HTML::FillInForm::Lite->new();
$str  = $h->fill( \$html_str, $q );



Couldn't be any simpler.

Features! Attachments



Dada Mail has supported sending a message out with attachments for quite some time - I can't remember a time it didn't, but it only presented the user with a file upload widget, for the user to pick a local file. Not very flexible, and not something that was going to work, as it's not really possible to save the state of the file upload widget without really jumping through some flaming hoops - so let's not.

What I really needed was something a lot more flexible: I wanted to allow the user to pick a file from their local computer, as well as have available files that have already been uploaded. I also wanted it to be easy to add an attachment, but also to remove one, if they decide against it. Finally, and most importantly, it had to all work with the new Drafts feature - so, if we're using a different form field - something HTML::FillInForm::Lite can fill in, that would be ideal.

Dada Mail already ships with a web file browser called KCFinder, which itself is an alternative to a somewhat expensive add-on to CKEditor called CKFinder. Luckily, KCFinder works flippin' great as an image uploader for Dada Mail's HTML message text box. Could I use it to simply fill in the value of a form field?

I wasn't looking forward to reverse engineering how KCFinder talks to CKEditor, but to my surprise, there was an already-working example on KCFinder's site, that shows you how to open a new KCFinder window, and then capture the name of the file selected, for you to do anything you want with - say fill in a text field. Perfect! With a few flourishes, that feature was stamped out, in record time.

Incidentally, KCFinder has a semblance of security by using a sessioning system, so that only the right people can have access to its file upload/browsing feature. You don't want just anyone to be able to upload file type X to your server. Nightmarish!

Dastardly, KCFinder's server backend is in PHP, so it uses PHP-style-sessioning. Not the happiest place to be, if you have a Perl app. Thankfully, there's PHP::Session to bridge the PHP-style, Perl-style gap. Saved again!

Feature! jQuery Plugin


Perhaps some will remember the Bad Old Days, when one's JavaScript was a mish-mash of ill-begotten weird lines of noise - back before there was a framework of any way/shape/form to 'rastle all this stuff together. Dada Mail harkens back to the winter of 1999, and it's gone through many iterations of its JavaScript functionality - from minor flourishes, to utilizing prototype.js and Scriptaculous, and of course porting all that to jQuery.

With learning jQuery, one wants to make a jQuery plugin, and one learns sort of a dirty little secret: there's no one way to do so! It's filled with compromises (I mean - check it! It even states: Remember, it's a compromise!), and a metric ton of boilerplate code. You can't really think of a plugin as having a namespace, as plugins kinda only support one, "method". It makes Perl's OO system look exceptionally attractive. What a train wreck. Sigh.

So, I guess(?) you just sort of have to accept that this is a very warty part of jQuery, and JavaScript is sort of a very warty thing in and of itself, but there's no alternative. My only real constraint was that I wanted to have > 1 method available. For this plugin: I wanted to be able to create a subscription form that ties into a Dada Mail mailing list, and I also wanted a method to control an already existing form, so that a designer can have full control over the presentation of the form, if they don't like what gets created by the plugin itself.

I used this boilerplate as a basis for my own plugin (179 lines! Of boilerplate!), and created the two "methods" I wanted, and once you get your brain around how things work, it was, eh, OK. Once the plugin takes control of a form, either one it creates, or one it "takes control of", I wanted to simply show the results of a subscription request in a modal window.  Like-a-dis:



Luckily, Dada Mail already takes advantage of another jQuery plugin, called Colorbox, so it was easy enough to leverage that for our modal window. The trick really was having jQuery talk to Dada Mail, and for that we use Dada Mail's sparkling new RESTful interface, that jQuery can talk to via JSONP, which itself is a weird hackity hack, to get over a security thing-a-mabob. Great!

Like a ton of things in Perl, working with JSON is pretty darn easy. Here's how we take a request from jQuery:

$.ajax({
 url: $("#" + targetForm).attr("action") + '/json/subscribe',
 type: "POST",
 dataType: "jsonp",
 cache: false,
 data: JSON.stringify(
  { 
   list:  $("#" + targetForm + " :input[name='list']").val(),
   email: $("#" + targetForm + " :input[name='email']").val(),
   fields: fields
   }
 ),
 contentType: "application/json; charset=UTF-8",
 success: function(data) {
  /* Do something, after success! */
  $.colorbox({
   html: '<h1>Hey, that worked!</h1>',
   opacity: 0.50
  });
  
 },
 error: function(xhr, ajaxOptions, thrownError) {
  console.log('status: ' + xhr.status);
  console.log('thrownError:' + thrownError);
  $.colorbox({
   html: '<h1>Apologies,</h1>An error occurred while processing your request. Please try again in a few minutes.',
   opacity: 0.50
  });
 }
});


And here's how we could handle the request:

    require JSON::PP;
    my $json = JSON::PP->new->allow_nonref;
    if ( !$q->content_type || $q->content_type =~ m/text\/html/ ) {
        # Perhaps, die here? Show the API docs? 
    }
    elsif ( $q->content_type =~ m/application\/json/ ) {
        # well. OK, then.
    }
    else {
        die '425';
    }
    my $post_data = $q->param('POSTDATA');

    my $data = undef;
    try {
        $data = $json->decode($post_data);
    }
    catch {
        die '400';
    };

    # Now, we can work with $data! 
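
The same idea can be sketched as a small standalone sub, using only core Perl. This isn't Dada Mail's code: the name decode_request_body is made up, a plain eval stands in for try/catch, and instead of dying it returns a status code next to the data. The 415 used here is the standard HTTP status for an unsupported media type, and 200/400 are likewise just this sketch's choices.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: takes the request's content type and raw body,
# returns (status, data). data is a hashref only when status is 200.
sub decode_request_body {
    my ( $content_type, $body ) = @_;

    # Anything other than JSON is refused before we try to parse it
    return ( 415, undef )
      unless defined $content_type
      && $content_type =~ m{\Aapplication/json\b}i;

    return ( 400, undef ) unless defined $body;

    # decode() croaks on malformed JSON, so trap that with eval
    my $data = eval { JSON::PP->new->decode($body) };
    return ( 400, undef ) if !defined $data || ref($data) ne 'HASH';

    return ( 200, $data );
}

my ( $status, $data ) = decode_request_body(
    'application/json; charset=UTF-8',
    '{"list":"somelist","email":"you@yours.com"}'
);
print "$status: $data->{email} wants to join $data->{list}\n" if $status == 200;
```

Returning the status rather than dying keeps the decision about what to show the client in one place, but that is a design preference, not something the original code requires.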


The $data hashref now contains the parameters that the jQuery stuff passes to us (list to subscribe to, email to subscribe with, etc). I did a ton of housekeeping in Dada Mail's own system to make it dead easy to go through the subscription process, and have the method return quite the complex data structure - from which you can glean all sorts of information: was it successful? No? What was the problem? What should I do now - show a message? redirect to a different URL?

A complex data structure is fine, but it's a Perl complex data structure. Thankfully, turning a data structure into JSON for jQuery to understand is again, dead simple:


    my $fancy_data = { 
         email => 'you@yours.com', 
         list  => 'somelist', 
         status => 1, 
         fancy_message => 'Hey, that worked! Hurrah!', 
    }; 
    require JSON::PP;
    my $json = JSON::PP->new->allow_nonref;
    my $data_back = $json->pretty->encode($fancy_data);


$data_back will now be the following JSON:
 
{
   "email" : "you@yours.com",
   "status" : 1,
   "fancy_message" : "Hey, that worked! Hurrah!",
   "list" : "somelist"
}
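
One thing to keep in mind with output like that: Perl hashes have no fixed key order, so the order of the keys in the encoded JSON can change from run to run. That's fine for jQuery, but awkward for tests or for comparing two responses. A small sketch of one way around it, using JSON::PP's canonical option, which sorts the keys (the sub name encode_response is made up for the example):

```perl
use strict;
use warnings;
use JSON::PP ();

# Encode a response hashref with sorted keys, so the output is repeatable.
sub encode_response {
    my ($data) = @_;
    return JSON::PP->new->canonical->encode($data);
}

print encode_response(
    {
        email         => 'you@yours.com',
        list          => 'somelist',
        status        => 1,
        fancy_message => 'Hey, that worked! Hurrah!',
    }
), "\n";

# prints:
# {"email":"you@yours.com","fancy_message":"Hey, that worked! Hurrah!","list":"somelist","status":1}
```

Chaining ->pretty onto the same object gives the indented layout shown above, with the keys in the same sorted order.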


I'm actually really glad I had to figure out how to make things work by creating this JSON response, as it truly forced me to make the Perl code clear, pragmatic, and a whole lot less complex - which surprisingly made things a lot more flexible. Kind of like being in the middle of a solitaire game you know you're going to win, there's more simplification I can do to really drill down and make things way easier to read. Again: no real tricks, just elbow grease and going in there and making small changes, which add up!

Here's the docs to the jQuery plugin, as well as the RESTful interface (which is still very much in a moving-and-shaking phase of things - I still have a ton to learn!), that'll give you a better idea on how this all fits together - it's complete with full examples of using the interface in both a Perl app, as well as your own jQuery app. The jQuery plugin for this release can be viewed here.


Friday, September 6, 2013

Dada Mail, v6.6.0 Released!

I've just put out a new release of Dada Mail - v6.6.0. It's a web-based mailing list manager, written in Perl. You can check out/fork/push/pull the source code here. Some of its main goals are to be pretty easy to install by mere mortals, easy to work with, yet pretty darn powerful and flexible, to fill a ton of different needs - it's not really trying to be a replacement to the creaky Majordomo, or Mailman.

I'm heading up to my 14th year of developing the application. It's currently free to download from its own support site as well as from various installer services that have popped up in the last few years (Mojo Marketplace, Installatron, Softaculous, etc.). Follow along with the project on Twitter, or hey - we have a mailing list too!

There's lots of things I like about working on it, but the application has become quite monolithic, and a little less flexible in its underlying code, than I'd like. In the next few months, I'm hoping to alleviate that, without falling into the problem of totally rewriting the app (it would be almost impossible to do so, with a one-dude team).

My initial idea is to move from basically no framework, to working with CGI::Application. That should help me gain some sort of framework, and a way to break the overgrowth of parts of this app, into more manageable pieces. It'll also hopefully give me a bridge to be able to have Dada Mail run as a CGI script (how people are used to running Dada Mail), as well as a PSGI app (a great alternative, that has a TON of benefits, one being SPEED!).

From there, we'll see how it works, but it may be time to move on to some more whiz-bang things that a framework like Mojolicious would provide. The bummer with trying to move to Mojolicious is that the Perl version required is rarely met on the shared hosting accounts that the majority of people who run Dada Mail are on. Saying, "hey, just install perlbrew!" is a far cry from what lots of people are capable of. They're used to a one-click install. Sigh.

Maybe cPanel, being a Perl-based shop, knows of a better way to have available a more up-to-date Perl version? A how-to, so as not to fudge this up, would be super-helpful and would somewhat break huge shackles from those that ship apps for these types of hosting situations. (maybe you know?)

Without further ado, here's the changelog from this last release: