注册 登录  
 加关注
查看详情
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

FY

Johnson 's Blog

 
 
 

日志

 
 

A Perl Regular Expression That Matches Prime Numbers  

2011-12-27 13:39:56|  分类: perl |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

A Perl Regular Expression That Matches Prime Numbers

Click Here
perl -lne '(1x$_) =~ /^1?$|^(11+?)\1+$/ || print "$_ is prime"'

Can you figure out how it works? I give an explanation below, but try to figure it out yourself. Here is what happens when you run it:

$ perl -lne '(1x$_) =~ /^1?$|^(11+?)\1+$/ || print "$_ is prime"'
1
2
2 is prime
3
3 is prime
4
5
5 is prime
6
7
7 is prime
8
9
10
11
11 is prime

Here is how it works.

First, the number is converted in its unary representation by (1x$_). For example, the number 5 gets converted into 1x5, which is 11111 (1 repeated 5 times.)

Next, the unary string gets tested against the regular expression. If it matches, the number is a composite, otherwise it's a prime.

The regular expression works this way. It consists of two parts ^1?$ and ^(11+?)\1+$.

The first part matches number 1 and the empty string. Clearly, the empty string and number 1 are not prime numbers, therefore this regular expression matches, which indicates that they are not prime numbers.

The second part determines if two or more 1s repeatedly make up the whole number. If two or more 1s repeatedly make up the whole number, the regex matches, which means that the number is composite. Otherwise it's a prime.

Let's look at the second regex part on numbers 5 and 4.

The number 5 in unary representation is 11111. The (11+?) matches the first two ones 11. The back-reference \1 becomes 11 and the whole regex now becomes ^11(11)+$. It can't match five ones, therefore it fails. But since it used +?, it backtracks and matches the first three ones 111. The back-reference becomes 111 and the whole regex becomes ^111(111)+$. It doesn't match again. This repeats for 1111 and 11111, which also don't match, therefore the whole regex doesn't match and the number is a prime.

The number 4 in unary representation is 1111. The (11+?) matches the first two ones 11. The back-reference \1 becomes 11 and the regex becomes ^11(11)+$. It matches the original string, therefore the number is not a prime.

PS. I didn't invent this regular expression, it was invented in 1998 by Abigail.

Don't take this regular expression too seriously, it's actually neither a regular expression (as defined in automata theory), nor a way to check if a number is a prime. It's just an awesome thing that Perl can do. See this cool article called The Prime That Wasn't by Andrei Zmievski for a discussion about how this regex fails for larger numbers because of backtracking.

Also if you wish to learn more about Perl one-liners, check out my Perl One-Liners Explained article series and download the perl1line.txt file.

  评论这张
 
阅读(397)| 评论(0)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2018