
Re: Regex Question



"Diego Gardel" <dgardel@hotmail.com> wrote ..
> I'm one of the good guys.

Sure you are, Mr. Hotmail. Trust me, I *am* one of the "good guys", and I don't know anyone who performs bulk data mining for plaintext credit card numbers on their own systems for "good" reasons.

Are you part of an info-sec, audit, or pen-test team? If so, you should have access to credit card account specification data that will tell you everything you need to know about credit card numbers. That is, the requirements for the program you are writing should spell out _exactly_ what you need to look for. Otherwise, you can't say whether your program actually works, and your analyst needs to do more work specifying what it is supposed to do.

If you're looking for the fastest way to locate *possible* credit card numbers, then I would look for a sequence of 16 digits. You'll have to decide for yourself which characters constitute sequence terminators, and which characters are allowed between digits (and how many). Once you have a candidate, you can perform the checksum test -- see any of the validation source code for the details. If your data still passes, then flag the file/stream as potentially containing CC numbers, and you're done.
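
A rough Python sketch of that two-step approach follows. It makes several assumptions that are not part of the advice above and that you would replace with your own requirements: at most one space or dash may sit between digits, any non-digit terminates a sequence, and the "checksum test" is the Luhn (mod 10) check that common validation code uses. The names CANDIDATE, luhn_ok and may_contain_card_numbers are made up for illustration.

```python
import re

# Assumption: 16 digits, each optionally preceded by ONE space or dash,
# and the whole run must not be bordered by another digit.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){15}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) check."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


def may_contain_card_numbers(text: str) -> bool:
    """Flag text that holds at least one candidate passing the checksum."""
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())  # strip separators
        if luhn_ok(digits):
            return True
    return False
```

Calling may_contain_card_numbers() on the contents of each file gives a yes/no flag per file. A hit only means "possible card number": plenty of unrelated 16-digit strings pass the checksum by chance.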

There is no way we can offer a more specific regex solution unless you share some attributes of the data you are examining: for example, whether it is text files (e.g. email or HTML), BCD data, or formatted data, and whether it includes expiration data, VVS data, etc. To come up with a decent regex, you'll need to specify what characters might be allowed between digits (e.g. spaces and dashes), and what the sequence-terminating characters might be (e.g. non-digits, excluding the inter-digit characters).
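
To show why those details matter, here is a small Python sketch in which the answers become parameters of the pattern. The function build_candidate_pattern and its defaults (16 digits, space and dash as separators, at most one separator between digits, any non-digit as terminator) are hypothetical placeholders, not a recommendation for your data.

```python
import re


def build_candidate_pattern(length=16, separators=" -", max_gap=1):
    """Build a regex for `length` digits with up to `max_gap` separator
    characters between neighbouring digits.

    Assumption: a sequence is terminated by anything that is not a digit,
    so the match must not be bordered by another digit.
    """
    if separators:
        # Character class of allowed inter-digit characters, e.g. [\ \-]{0,1}
        gap = "[" + re.escape(separators) + "]" + f"{{0,{max_gap}}}"
    else:
        gap = ""  # digits must be contiguous
    body = rf"\d(?:{gap}\d){{{length - 1}}}"
    return re.compile(rf"(?<!\d){body}(?!\d)")


default_pattern = build_candidate_pattern()
strict_pattern = build_candidate_pattern(separators="")
loose_pattern = build_candidate_pattern(max_gap=2)
```

The same input can match one pattern and miss another, which is the point: without knowing the data, nobody can say which of these is the "right" regex.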

Mike/

