The history of regular expressions

From: Using Regular Expressions

In this movie, we'll take a look at the history of regular expressions. Now the history is not merely academic; it's actually important in understanding some of the key points in how regular expressions work. I think the most surprising part of the history of regular expressions is that they first got their start in the field of neuroscience, way back in the 1940s. In 1943, Warren McCulloch and Walter Pitts developed models describing how the human nervous system works, or how a machine or a computer could be built to act more like a human brain. In 1956, Stephen Kleene described these models with an algebra that he called regular sets, and he created a notation to express them called regular expressions.

That's where we get the name from; Stephen Kleene is the one who coined it. But at this point, regular expressions have still not entered the computer world; they are not part of the digital age yet. It's not until 1968 that Ken Thompson, an early computer pioneer and one of the key developers of UNIX, implemented regular expressions inside an early UNIX text editor that he was building called ed. This is the point at which regular expressions entered the computer world, and it happens right there at the birth of UNIX. So the future of regular expressions and UNIX is very much tied together.

Now if you were a user of this text editor, ed, and you wanted to search the text for a regular expression, you would do it by typing a g and then a forward slash, then the set of symbols that made up the regular expression for what you wanted to search for, and then at the end, another forward slash and a p. The g and the p were modifiers. The g told the editor to search globally for this expression, to search everywhere, and the p told it to output the results to the screen, to print them. So we end up with global regular expression print, written g/re/p, or for short, grep. It becomes a verb.
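To make the g/re/p idea concrete, here is a minimal sketch in Python of what that command does: go through every line, and print the ones that contain a match. This is only an illustration, not how ed is implemented. The function name global_regex_print and the sample lines are made up for this example, and Python's re module uses the modern Perl-style syntax rather than ed's original syntax.

```python
import re

def global_regex_print(pattern, lines):
    """Return every line that contains a match for pattern (the g/re/p idea)."""
    regex = re.compile(pattern)
    # "g": look at every line; "re": test the expression; "p": keep the line for printing
    return [line for line in lines if regex.search(line)]

# Hypothetical text buffer, standing in for the file being edited
buffer = [
    "UNIX was born at Bell Labs",
    "ed is a line editor",
    "grep grew out of ed",
]

# The "." is a wildcard, so gr.p matches "grep"
for line in global_regex_print(r"gr.p", buffer):
    print(line)
```

In ed itself, the equivalent command for this search would be typed as g/gr.p/p.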

In the UNIX world, you're able to say, I want to grep something and it means that you want to search it for a regular expression. grep became so popular that it actually became a stand-alone program so that you could grep things in the UNIX file system as well, and it became widely used in other UNIX programs. So it really kind of spreads its way throughout the UNIX ecosystem. Now in my course UNIX for Mac OS X Users, I describe in some detail how UNIX became very popular during the 1970s. The short version of that story is it was high quality software that was free.

These two factors made it very attractive to universities, and these universities then taught the next generation of computer stars using UNIX. So it helped to spread not only UNIX, but regular expressions as well. Now throughout the 1970s, UNIX spreads in popularity and it begins to evolve. At the same time, grep begins to evolve as well. Now there's a problem with evolving the regular expression language. We have a set of symbols that clearly define what matches and what doesn't match.

Well, if you start changing the syntax of those symbols, then we create issues of backwards compatibility. Imagine if you had a basic character that didn't have a special meaning in one version, but then in a future version that character suddenly has some special meaning. Well, all those old regular expressions would break; they would no longer match, or the old engines would no longer be able to process the new regular expressions. So one of the early changes is to introduce, in addition to grep, a new program called egrep, or extended grep. You can actually get the same behavior as egrep inside grep by using the -E option after grep.

So grep -E is essentially the same thing as egrep, and it's saying, use this new modified syntax. So we're going to now have two flavors. We have the old ones and we have the new ones. Now over time these regular expressions continue to spread. There are many programs, there are more programmers, there are more changes, so we end up with a lot more incompatibilities. So in 1986, everyone sits down and comes up with a standard, which they call POSIX, Portable Operating System Interface--the X is just because it's in UNIX.

So POSIX is a standard that is designed to ensure compatibility between different operating systems. So the first thing that it does is it says, all right, there are going to be two different kinds of regular expressions. There are going to be BREs, which are basic regular expressions, and that's essentially what grep is, and then there are going to be EREs, which are extended regular expressions, and that's what egrep is. So now all programs and programmers have to decide, are we going to try and implement them in the flavor of BREs or EREs? And there's a very clearly defined set of rules about what should match and what shouldn't match, and what each symbol should mean in each one of these.
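Here is a small sketch of why the two flavors can't simply be merged. One well-known difference is the plus sign: in a basic regular expression an unescaped + is just an ordinary character, while in an extended regular expression it means "one or more of the previous item". Python's re module doesn't implement POSIX BREs, so the code below only emulates the basic reading by escaping the pattern with re.escape. The function name match_under_both_readings is made up for this example.

```python
import re

def match_under_both_readings(pattern, text):
    """Return (literal_result, extended_result) for one pattern and one text."""
    # Basic-style reading, emulated: every character in the pattern is literal
    literal = re.compile(re.escape(pattern))
    # Extended-style reading: "+" is a metacharacter meaning "one or more"
    extended = re.compile(pattern)
    return (bool(literal.fullmatch(text)), bool(extended.fullmatch(text)))

pattern = "a+b"
for text in ["a+b", "aab", "ab"]:
    literal_result, extended_result = match_under_both_readings(pattern, text)
    print(text, "literal:", literal_result, "extended:", extended_result)
```

The same three characters, a+b, match the text "a+b" under one reading and the texts "aab" and "ab" under the other, which is exactly the kind of backwards compatibility problem that made it necessary to keep the two flavors separate.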

Now it's not expected that BREs and EREs ought to be interchangeable, but at least we have two clear paths forward. And BRE is really maintained for compatibility in old tools--it becomes mostly out of use--and EREs are what most modern tools are going to use. So this big effort to standardize everything really does a lot of good and really gets everyone on the same page about how regular expressions ought to work. Now at this exact same time, Henry Spencer writes a regex library in the C programming language.

And what's great about the fact that it's a library is that it can be incorporated into other programs, and so it provides consistency, because for everyone who uses his library, their regular expressions all work the same way. So things at this point have become more consistent and the changes to regular expressions have really stabilized. In 1987, Larry Wall releases the Perl programming language. It uses Spencer's regex library, but over time, it adds many more powerful features. Perl's real mission was to try and be a programming language that was designed to be really useful, and so it added more powerful features to make it more useful.

And because of these powerful features, it really becomes the gold standard of the way that people want their regex libraries to work. Everybody out there writing other programming languages is saying, boy, I really wish mine worked the way Perl does! So all of these features start to creep in. So we have Perl-compatible languages and programs that are out there. So Apache, C, C++, the .NET languages, Java, JavaScript, MySQL, PHP, Python, Ruby, all of those are endeavoring to be Perl-compatible languages and programs. There's also a library called the PCRE library; PCRE stands for Perl-Compatible Regular Expressions.

And just like Henry Spencer's library, it's a library that's supposed to add in all these extra features that Perl has. Now do you notice any similarity between Perl and most of those languages and programs that I've listed there? They're all tools that are used to build the Internet and websites. It was really the rise of the web that gave a big boost to the Perl implementation of regex, and that's where we get the modern syntax of regular expressions today; it really comes from Perl.

This video is part of

Using Regular Expressions

59 video lessons · 12443 viewers

Kevin Skoglund
Author

 
  1. 2m 18s
    1. Welcome
      56s
    2. Using the exercise files
      1m 22s
  2. 19m 55s
    1. What are regular expressions?
      3m 20s
    2. The history of regular expressions
      6m 40s
    3. Regular expression engines
      2m 44s
    4. Installing an engine
      4m 5s
    5. Notation conventions and modes
      3m 6s
  3. 21m 23s
    1. Literal characters
      6m 39s
    2. Metacharacters
      2m 1s
    3. The wildcard metacharacter
      4m 31s
    4. Escaping metacharacters
      4m 53s
    5. Other special characters
      3m 19s
  4. 31m 26s
    1. Defining a character set
      5m 49s
    2. Character ranges
      4m 49s
    3. Negative character sets
      4m 53s
    4. Metacharacters inside character sets
      5m 12s
    5. Shorthand character sets
      6m 30s
    6. POSIX bracket expressions
      4m 13s
  5. 36m 38s
    1. Repetition metacharacters
      7m 17s
    2. Quantified repetition
      6m 59s
    3. Greedy expressions
      6m 27s
    4. Lazy expressions
      6m 46s
    5. Using repetition efficiently
      9m 9s
  6. 20m 24s
    1. Grouping metacharacters
      4m 14s
    2. Alternation metacharacter
      4m 54s
    3. Writing logical and efficient alternations
      7m 33s
    4. Repeating and nesting alternations
      3m 43s
  7. 19m 19s
    1. Start and end anchors
      7m 21s
    2. Line breaks and Multiline mode
      4m 41s
    3. Word boundaries
      7m 17s
  8. 23m 33s
    1. Backreferences
      8m 57s
    2. Backreferences to optional expressions
      3m 51s
    3. Finding and replacing using backreferences
      7m 16s
    4. Non-capturing group expressions
      3m 29s
  9. 32m 31s
    1. Positive lookahead assertions
      6m 39s
    2. Double-testing with lookahead assertions
      7m 16s
    3. Negative lookahead assertions
      6m 10s
    4. Lookbehind assertions
      6m 26s
    5. The power of positions
      6m 0s
  10. 13m 13s
    1. About Unicode
      4m 19s
    2. Unicode in regular expressions
      4m 41s
    3. Unicode wildcards and properties
      4m 13s
  11. 1h 55m
    1. How to use this chapter
      5m 38s
    2. Matching names
      6m 33s
    3. Matching postal codes
      8m 54s
    4. Matching email addresses
      5m 0s
    5. Matching URLs
      8m 1s
    6. Matching decimal numbers and currency
      6m 45s
    7. Matching IP addresses
      7m 10s
    8. Matching dates
      7m 49s
    9. Matching times
      8m 59s
    10. Matching HTML tags
      8m 34s
    11. Matching passwords
      6m 49s
    12. Matching credit card numbers
      9m 36s
    13. Finding words near other words
      6m 38s
    14. Formatting with Search and Replace, pt. 1
      7m 22s
    15. Formatting with Search and Replace, pt. 2
      4m 15s
    16. Formatting with Search and Replace, pt. 3
      7m 10s
  12. 47s
    1. Goodbye
      47s
