Need to match zip codes or postal codes? Of course, no two countries use the same format. But here are solutions for USA, Canada, and the United Kingdom.
The USA format is simple: five digits as 99999, or ZIP+4 as 99999-9999. A simple RegEx could be:
\d{5}(-\d{4})?
If you want to reject zips with a trailing hyphen (as in 99999-), you could use a lookahead condition (conditionals are supported by engines such as PCRE and .NET, but not by every regex flavor):
\d{5}(?(?=-)-\d{4})
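Python's re module does not accept a lookahead as a condition, so the expression above will not compile there. As a rough sketch of the same idea, the version below assumes an alternation with a negative lookahead is an acceptable stand-in: either the hyphen and four digits follow, or the next character is not a hyphen. The names ZIP_BASIC and ZIP_STRICT and the sample text are made up for illustration.

```python
import re

# Basic pattern from the post: five digits, optionally followed by -9999.
ZIP_BASIC = re.compile(r"\d{5}(-\d{4})?")

# Assumption: stand-in for the conditional, written for Python's re module.
# After the five digits, either "-9999" follows or the next char is not "-".
ZIP_STRICT = re.compile(r"\d{5}(?:-\d{4}|(?!-))")

text = "ship to 12345- today"  # hypothetical input with a dangling hyphen

print(ZIP_BASIC.search(text).group())  # 12345
print(ZIP_STRICT.search(text))         # None
```

Both patterns are unanchored, as in the post, so they are used with search() here rather than as whole-string validators.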
Canada is a little trickier. The format looks like A1A 1A1, which can be easily matched with:
[A-Z]\d[A-Z] \d[A-Z]\d
However, one rule can be used to improve validation. The first character of the opening set of characters (technically called the "forward sortation area" or FSA) identifies the province, territory, or region. Eighteen characters are valid in this position: A for Newfoundland and Labrador, B for Nova Scotia, K, L, N and P for Ontario excluding Toronto (which uses M), and so on. Validation should ideally check that the first character is one of them. So here is a better Canadian postal code regular expression:
[ABCEGHJKLMNPRSTVXY]\d[A-Z] \d[A-Z]\d
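For anyone wanting to try the Canadian pattern, here is a minimal Python sketch. It assumes the whole string should be a postal code and that input is upper-cased before matching; the function name and the sample codes are illustrative only.

```python
import re

# Pattern from the post: first letter limited to the 18 valid characters.
CA_POSTAL = re.compile(r"[ABCEGHJKLMNPRSTVXY]\d[A-Z] \d[A-Z]\d")


def is_canadian_postal_code(text: str) -> bool:
    """Return True when the whole string looks like A1A 1A1."""
    # Assumption: surrounding whitespace and lower case are tolerated.
    return CA_POSTAL.fullmatch(text.strip().upper()) is not None


print(is_canadian_postal_code("K1A 0B1"))  # True
print(is_canadian_postal_code("Z1A 0B1"))  # False, Z is not a valid first letter
```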
Good old UK is the trickiest of the three. United Kingdom postcodes, as defined by the Royal Mail, are five, six, or seven letters and digits, plus a single space. Postcodes are made up of two parts, the "outward postcode" (or outcode), and the "inward postcode" (or incode). The outcode is one or two letters followed by one or two digits, or one or two letters followed by a digit and a letter. The incode is always a single digit followed by two letters (any letters except C, I, K, M, O, and V). The outcode and incode are separated by a space. Here's the regular expression:
[A-Z]{1,2}\d[A-Z\d]? \d[ABD-HJLNP-UW-Z]{2}
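A small Python sketch of the UK expression follows. The two capture groups around the outcode and incode are an addition for illustration and are not part of the expression above; the function name and sample postcodes are likewise assumptions.

```python
import re
from typing import Optional, Tuple

# Same expression as above, with capture groups added (an assumption)
# so the outcode and incode can be pulled apart.
UK_POSTCODE = re.compile(r"([A-Z]{1,2}\d[A-Z\d]?) (\d[ABD-HJLNP-UW-Z]{2})")


def split_postcode(text: str) -> Optional[Tuple[str, str]]:
    """Return (outcode, incode), or None when the text does not match."""
    match = UK_POSTCODE.fullmatch(text.strip().upper())
    return (match.group(1), match.group(2)) if match else None


print(split_postcode("SW1A 1AA"))  # ('SW1A', '1AA')
print(split_postcode("M1 1CA"))    # None, C is not allowed in the incode
```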
If you have any other countries or formats to share, please do so.
http://www.upu.int/post_code/en/addressing_formats...
For specific in-country info (I'll bet some countries' postal codes vary internally):
http://www.upu.int/post_code/en/list_of_sites_by_c...
zips in various countries is to consult the Perl module at http://search.cpan.org/perldoc/Regexp::Common::zip...
^(?:(?:(?:0[13578]|1[02])([\/\-.]?)31)\1|(?:(?:0[13-9]|1[0-2])([\/\-.]?)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)\d{2})$|^(?:02([\/\-.]?)29\3(?:(?:(?:1[6-9]|[2-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0[1-9])|(?:1[0-2]))([\/\-.]?)(?:0[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)\d{2})$
It's probably the most complicated one I have, but it works really well and can be tweaked for different formats.
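Reading the pattern, it appears to validate MM/DD/YYYY dates for the years 1600 to 9999, with an optional /, - or . separator that must be the same in both positions, and with leap-year handling for 02/29. Here is a quick Python sketch to check that reading. The pattern is written here without "|" or "," inside its character classes, and the function name and sample dates are made up.

```python
import re

# Assumption: copied from the comment above, without "|" or "," inside
# the character classes. Split over three lines, one per alternative.
DATE_PATTERN = re.compile(
    r"^(?:(?:(?:0[13578]|1[02])([\/\-.]?)31)\1|(?:(?:0[13-9]|1[0-2])([\/\-.]?)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)\d{2})$"
    r"|^(?:02([\/\-.]?)29\3(?:(?:(?:1[6-9]|[2-9]\d)(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$"
    r"|^(?:(?:0[1-9])|(?:1[0-2]))([\/\-.]?)(?:0[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)\d{2})$"
)


def is_valid_date(text: str) -> bool:
    """Return True when the text matches the MM/DD/YYYY pattern."""
    return DATE_PATTERN.match(text) is not None


print(is_valid_date("02/29/2000"))  # True, 2000 is a leap year
print(is_valid_date("02/29/1900"))  # False, 1900 is not
print(is_valid_date("12-31/1999"))  # False, mixed separators
```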
Phone numbers:
(^[0-9]{10}$)|(^\([0-9]{3}\) ?[0-9]{3}\-[0-9]{4}$)|(^[0-9]{3}([\.\-])([0-9]{3})\4[0-9]{4}$)
SSNs:
(^[0-9]{9}$)|(^[0-9]{3}\-[0-9]{2}\-[0-9]{4}$)
A different approach to emails:
^[\w.\-]+@(\w+[\w-]+\.){0,3}\w+[\w-]+\.[a-zA-Z]{2,4}$
How about even last names:
^[A-Za-z]*[\.\ '\-]?[A-Za-z]*$
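The phone number pattern above uses a backreference (\4) so that the same separator has to appear in both positions. Here is a rough Python check of that behaviour; the function name and the 555 sample numbers are made up.

```python
import re

# Phone pattern from the comment above, split over two lines for readability.
# Group 4 is the separator, so \4 forces the second separator to match it.
PHONE = re.compile(
    r"(^[0-9]{10}$)|(^\([0-9]{3}\) ?[0-9]{3}\-[0-9]{4}$)"
    r"|(^[0-9]{3}([\.\-])([0-9]{3})\4[0-9]{4}$)"
)


def is_phone(text: str) -> bool:
    """Return True when the text matches one of the three phone formats."""
    return PHONE.match(text) is not None


print(is_phone("(555) 123-4567"))  # True
print(is_phone("555.123.4567"))    # True
print(is_phone("555-123.4567"))    # False, separators differ
```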
^[A-Za-z]{1,2}[\d]{1,2}([A-Za-z])?\s?[\d][A-Za-z]{2}$
[A-Z]{1,2}\d[A-Z\d]? \d[ABD-HJLNP-UW-Z]{2}
Won't this also match:
AAAN NAA
(A=character)
(N=digit)
which it shouldn't do!
As opposed to many countries' postcodes, the Dutch postcodes are unique per block or at least street. So stating the postcode plus house number refers to a unique house.
Jerry
So the complete expression is:
[ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ] \d[ABCEGHJKLMNPRSTVWXYZ]\d
For more information about Canadian postal codes or geocoding data, check out our site at http://www.infinitegravity.ca
[ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ] \d[ABCEGHJKLMNPRSTVWXYZ]\d$
^ ?(([BEGLMNSWbeglmnsw][0-9][0-9]?)|(([A-PR-UWYZa-pr-uwyz][A-HK-Ya-hk-y][0-9][0-9]?)|(([ENWenw][0-9][A-HJKSTUWa-hjkstuw])|([ENWenw][A-HK-Ya-hk-y][0-9][ABEHMNPRVWXYabehmnprvwxy])))) ?[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}$
@Martin
That regular expression won't match "AAAN NAA". Though that comment was from over three years ago!
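Whether the expression matches a string of the form AAAN NAA may depend on how it is applied. The expression as posted has no ^ or $ anchors, so a substring search can still find a valid-looking postcode inside the longer string, while a whole-string match rejects it. A small Python sketch, using a made-up sample value:

```python
import re

PATTERN = r"[A-Z]{1,2}\d[A-Z\d]? \d[ABD-HJLNP-UW-Z]{2}"

sample = "ABC1 2DE"  # hypothetical string of the form AAAN NAA

# Whole-string match: three leading letters are too many, so no match.
print(re.fullmatch(PATTERN, sample))       # None

# Substring search: skips the first letter and matches the rest.
print(re.search(PATTERN, sample).group())  # BC1 2DE
```

Wrapping the expression in ^ and $ gives the whole-string behaviour in engines without a fullmatch-style call.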
I would like to match any UK address with house number 621 and postcode le26un.
With some work, I found that the following works both with and without a space between the first and second half of the code.
@regexpr(OWNER, "[ABCEGHJKLMNPRSTVXY][0-9][A-Z]( |)[0-9][A-Z][0-9]")
^[a-zA-Z][0-9A-Za-z]{2}-[0-9]{2}-[0-9A-Za-z]{3}$
which dictate some of the characters. For this reason a simple syntax definition (incorrect in itself as shown) is insufficient. Loc8 was designed to satisfy safety-critical requirements, and therefore basic syntax validation is wholly insufficient and foolhardy. Rights to the inherent validation calculations are reserved. If anyone wishes to discuss use of the correct format, please contact me directly.
Many thanks
Gary