php

regex

email

email-validation

I have this function to validate an email addresses:

function validateEMAIL($EMAIL) {
    $v = "/[a-zA-Z0-9_-.+][email protected][a-zA-Z0-9-]+.[a-zA-Z]+/";

    return (bool)preg_match($v, $EMAIL);
}

Is this okay for checking if the email address is valid or not?

Solution 1

The easiest and safest way to check whether an email address is well-formed is to use the filter_var() function:

if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
    // invalid emailaddress
}

Additionally you can check whether the domain defines an MX record:

if (!checkdnsrr($domain, 'MX')) {
    // domain is not valid
}

But this still doesn't guarantee that the mail exists. The only way to find that out is by sending a confirmation mail.


Now that you have your easy answer feel free to read on about email address validation if you care to learn or otherwise just use the fast answer and move on. No hard feelings.

Trying to validate an email address using a regex is an "impossible" task. I would go as far as to say that that regex you have made is useless. There are three rfc's regarding emailaddresses and writing a regex to catch wrong emailadresses and at the same time don't have false positives is something no mortal can do. Check out this list for tests (both failed and succeeded) of the regex used by PHP's filter_var() function.

Even the built-in PHP functions, email clients or servers don't get it right. Still in most cases filter_var is the best option.

If you want to know which regex pattern PHP (currently) uses to validate email addresses see the PHP source.

If you want to learn more about email addresses I suggest you to start reading the specs, but I have to warn you it is not an easy read by any stretch:

Solution 2

You can use filter_var for this.

<?php
   function validateEmail($email) {
      return filter_var($email, FILTER_VALIDATE_EMAIL);
   }
?>

Solution 3

In my experience, regex solutions have too many false positives and filter_var() solutions have false negatives (especially with all of the newer TLDs).

Instead, it's better to make sure the address has all of the required parts of an email address (user, "@" symbol, and domain), then verify that the domain itself exists.

There is no way to determine (server side) if an email user exists for an external domain.

This is a method I created in a Utility class:

public static function validateEmail($email)
{
    // SET INITIAL RETURN VARIABLES

        $emailIsValid = FALSE;

    // MAKE SURE AN EMPTY STRING WASN'T PASSED

        if (!empty($email))
        {
            // GET EMAIL PARTS

                $domain = ltrim(stristr($email, '@'), '@') . '.';
                $user   = stristr($email, '@', TRUE);

            // VALIDATE EMAIL ADDRESS

                if
                (
                    !empty($user) &&
                    !empty($domain) &&
                    checkdnsrr($domain)
                )
                {$emailIsValid = TRUE;}
        }

    // RETURN RESULT

        return $emailIsValid;
}

Solution 4

I think you might be better off using PHP's inbuilt filters - in this particular case:

It can return a true or false when supplied with the FILTER_VALIDATE_EMAIL param.

Solution 5

This will not only validate your email, but also sanitize it for unexpected characters:

$email  = $_POST['email'];
$emailB = filter_var($email, FILTER_SANITIZE_EMAIL);

if (filter_var($emailB, FILTER_VALIDATE_EMAIL) === false ||
    $emailB != $email
) {
    echo "This email adress isn't valid!";
    exit(0);
}

Solution 6

After reading the answers here, this is what I ended up with:

public static function isValidEmail(string $email) : bool
{
    if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
        return false;
    }

    //Get host name from email and check if it is valid
    $email_host = array_slice(explode("@", $email), -1)[0];

    // Check if valid IP (v4 or v6). If it is we can't do a DNS lookup
    if (!filter_var($email_host,FILTER_VALIDATE_IP, [
        'flags' => FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE,
    ])) {
        //Add a dot to the end of the host name to make a fully qualified domain name
        // and get last array element because an escaped @ is allowed in the local part (RFC 5322)
        // Then convert to ascii (http://us.php.net/manual/en/function.idn-to-ascii.php)
        $email_host = idn_to_ascii($email_host.'.');

        //Check for MX pointers in DNS (if there are no MX pointers the domain cannot receive emails)
        if (!checkdnsrr($email_host, "MX")) {
            return false;
        }
    }

    return true;
}

Solution 7

Use below code:

// Variable to check
$email = "[email protected]";

// Remove all illegal characters from email
$email = filter_var($email, FILTER_SANITIZE_EMAIL);


// Validate e-mail
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
  echo("Email is a valid email address");
}

Solution 8

Answered this in 'top question' about emails verification https://stackoverflow.com/a/41129750/1848217

For me the right way for checking emails is:

  1. Check that symbol @ exists, and before and after it there are some [email protected] symbols: /^[^@][email protected][^@]+$/
  2. Try to send an email to this address with some "activation code".
  3. When the user "activated" his email address, we will see that all is right.

Of course, you can show some warning or tooltip in front-end when user typed "strange" email to help him to avoid common mistakes, like no dot in domain part or spaces in name without quoting and so on. But you must accept the address "[email protected]" if user really want it.

Also, you must remember that email address standard was and can evolute, so you can't just type some "standard-valid" regexp once and for all times. And you must remember that some concrete internet servers can fail some details of common standard and in fact work with own "modified standard".

So, just check @, hint user on frontend and send verification emails on given address.

Solution 9

If you want to check if provided domain from email address is valid, use something like:

/*
* Check for valid MX record for given email domain
*/
if(!function_exists('check_email_domain')){
    function check_email_domain($email) {
        //Get host name from email and check if it is valid
        $email_host = explode("@", $email);     
        //Add a dot to the end of the host name to make a fully qualified domain name and get last array element because an escaped @ is allowed in the local part (RFC 5322)
        $host = end($email_host) . "."; 
        //Convert to ascii (http://us.php.net/manual/en/function.idn-to-ascii.php)
        return checkdnsrr(idn_to_ascii($host), "MX"); //(bool)       
    }
}

This is handy way to filter a lot of invalid email addresses, along with standart email validation, because valid email format does not mean valid email.

Note that idn_to_ascii() (or his sister function idn_to_utf8()) function may not be available in your PHP installation, it requires extensions PECL intl >= 1.0.2 and PECL idn >= 0.1.

Also keep in mind that IPv4 or IPv6 as domain part in email (for example [email protected][IPv6:2001:db8::1]) cannot be validated, only named hosts can.

See more here.

Solution 10

If you're just looking for an actual regex that allows for various dots, underscores and dashes, it as follows: [a-zA-z0-9.-]+\@[a-zA-z0-9.-]+.[a-zA-Z]+. That will allow a fairly stupid looking email like [email protected]_matrix.com to be validated.

Solution 11

/(?![[:alnum:]]|@|-|_|\.)./

Nowadays, if you use a HTML5 form with type=email then you're already by 80% safe since browser engines have their own validator. To complement it, add this regex to your preg_match_all() and negate it:

if (!preg_match_all("/(?![[:alnum:]]|@|-|_|\.)./",$email)) { .. }

Find the regex used by HTML5 forms for validation
https://regex101.com/r/mPEKmy/1

Solution 12

theres is a better regex built in FILTER_VALIDATE_EMAIL but any regex can give bad results.

For example..

// "not an email" is invalid so its false.
php > var_export(filter_var("not an email", FILTER_VALIDATE_EMAIL));
false
// "[email protected]" looks like an email, so it passes even though its not real.
php > var_export(filter_var("[email protected]", FILTER_VALIDATE_EMAIL));
'[email protected]'
// "[email protected]" passes, gmail is a valid email server,
//  but gmail require more than 3 letters for the address.
var_export(filter_var("[email protected]", FILTER_VALIDATE_EMAIL));
'[email protected]'

You might want to consider using an API like Real Email which can does in depth mailbox inspections to check if the email is real.

A bit like ..

$email = "[email protected]";
$api_key = ???;

$request_context = stream_context_create(array(
    'http' => array(
        'header'  => "Authorization: Bearer " . $api_key
    )
));

$result_json = file_get_contents("https://isitarealemail.com/api/email/validate?email=" . $email, false, $request_context);

if (json_decode($result_json, true)['status'] == "valid") {
    echo("email is valid");
} else if (json_decode($result_json, true)['status'] == "invalid") {
    echo("email is invalid");
} else {
  echo("email was unknown");
}

Solution 13

There are three RFCs that lay down the foundation for the "Internet Message Format".

  1. RFC 822
  2. RFC 2822 (Supersedes RFC 822)
  3. RFC 5322 (Supersedes RFC 2822)

The RFC 5322, however, defines the e-mail IDs and their naming structure in the most technical manner. That is more suitable laying down the foundation an Internet Standard that liberal enough to allow all the use-cases yet, conservative enough to bind it in some formalism.

However, the e-mail validation requirement from the software developer community, has the following needs -

  • to stave off unwanted spammers
  • to ensure the user does not make inadvertent mistake
  • to ensure that the e-mail ID belongs to the actual person inputting it

They are not exactly interested in implementing a technically all-encompassing definition that allows all the forms (IP addresses, including port IDs and all) of e-mail id. The solution suitable for their use-case is expected to solely ensure that all the legitimate e-mail holders should be able to get through. The definition of "legitimate" differs vastly from technical stand-point (RFC 5322 way) to usability stand-point(this solution). The usability aspect of the validation aims to ensure that all the e-mail IDs validated by the validation mechanism belong to actual people, using them for their communication purposes. This, thus introduces another angle to the validation process, ensuring an actually "in-use" e-mail ID, a requirement for which RFC-5322 definition is clearly not sufficient.

Thus, on practical grounds, the actual requirements boil down to this -

  1. To ensure some very basic validation checks
  2. To ensure that the inputted e-mail is in use

Second requirement typically involves, sending a standard response seeking e-mail to the inputted e-mail ID and authenticating the user based on the action delineated in the response mechanism. This is the most widely used mechanism to ensure the second requirement of validating an "in use" e-mail ID. This does involve round-tripping from the back-end server implementation and is not a straight-forward single-screen implementaion, however, one cannot do away with this.

The first requirement, stems from the need that the developers do not want totally "non e-mail like" strings to pass as an e-mail. This typically involves blanks, strings without "@" sign or without a domain name. Given the punycode representations of the domain names, if one needs to enable domain validation, they need to engage in full-fledged implementation that ensures a valid domain name. Thus, given the basic nature of requirement in this regard, validating for "<something>@<something>.<something>" is the only apt way of satisfying the requirement.

A typical regex that can satisfy this requirement is: ^[^@\s][email protected][^@\s.]+.[^@\s.]+$ The above regex, follows the standard Perl regular-expression standard, widely followed by majority of the programming languages. The validation statement is: <anything except whitespaces and "@" sign>@<anything except whitespaces and "@" sign>.<anything except whitespaces, @ sign and dot>

For those who want to go one step deeper into the more relevant implementations, they can follow the following validation methodology. <e-mail local part>@<domain name>

For <e-mail local part> - Follow the guidelines by the "Universal Acceptance Steering Group" - UASG-026 For <domain name>, you can follow any domain validation methodology using standard libraries, depending on your programming language. For the recent studies on the subject, follow the document UASG-018A.

Those who are interested to know the overall process, challenges and issues one may come across while implementing the Internationalized Email Solution, they can also go through the following RFCs:

RFC 6530 (Overview and Framework for Internationalized Email) RFC 6531 (SMTP Extension for Internationalized Email) RFC 6532 (Internationalized Email Headers) RFC 6533 (Internationalized Delivery Status and Disposition Notifications) RFC 6855 (IMAP Support for UTF-8) RFC 6856 (Post Office Protocol Version 3 (POP3) Support for UTF-8) RFC 6857 (Post-Delivery Message Downgrading for Internationalized Email Messages) RFC 6858 (Simplified POP and IMAP Downgrading for Internationalized Email).

Solution 14

The question title is fairly generic, however the body of the question indicates that it is about the PHP based solution. Will try to address both.

Generically speaking, for all programming languages: Typically, validating" an e-mail address with a reg-ex is something that any internet based service provider should desist from. The possibilities of kinds of domain names and e-mail addresses have increased so much in terms of variety, any attempt at validation, which is not well thought may end up denying some valid users into your system. To avoid this, one of the best ways is to send an email to the user and verify it being received. The good folks at "Universal Acceptance Steering Group" have compiled a languagewise list of libraries which are found to be compliant/non-compliant with various parameters involving validations vis-a-vis Internationalized Domain Names and Internationalized Email addresses. Please find the links to those documents over here and here.

Speaking specifically of PHP: There is one good library available in PHP i.e. EmailValidator. It is an email address validator that includes many validation methods such as DNS validation. The validator specifically recommended is called RFCValidator and validates email addresses against several RFCs. It has good compliance when it comes to being inclusive towards IDNs and Internationalized Email addresses.

Solution 15

I've made Python & PHP implementations of properly verify ANY email address, that is confirmed as real one from the mailserver for the domain that is real.

Released under GPL-3.0 lisence.

There you go:

https://lja.fi/index.php/github-stuff/

--lja

Solution 16

I have prepared a function that checks email validity:

function isValidEmail($email)
{
    $re = '/([\w\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/m';
    preg_match_all($re, $email, $matches, PREG_SET_ORDER, 0);
    if(count($matches) > 0) return $matches[0][0] === $email;
    return false;
}

The problem with FILTER_VALIDATE_EMAIL is that it considers even invalid emails as valid.

Following are example:

if(isValidEmail("[email protected]")) echo "valid";
if(!isValidEmail("fo^[email protected]")) echo "invalid";