Why were '^' and '$' used as symbols to represent the beginning and end of a line in regular expressions?



Software engineer Hillel Wayne has written on his blog about the history of why '^' and '$' came to mark the beginning and end of a line in regular expressions.

Why do regexes use `$` and `^` as line anchors? • Buttondown

https://buttondown.email/hillelwayne/archive/why-do-regexes-use-and-as-line-anchors/



A regular expression is a way of describing a set of strings with a single string. For example, the single pattern '/^G.*/' describes every string that starts with G, so one short pattern can stand for a large collection of strings. Among the characters used in regular expressions, '^' matches the beginning of a line and '$' matches the end of a line.
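As a rough illustration of how these two anchors behave, here is a minimal sketch using Python's standard `re` module. The sample strings and variable names are made up for this example and do not come from Mr. Wayne's post.

```python
import re

# Hypothetical sample strings, chosen only for illustration.
samples = ["Gigazine", "Regex", "Going", "big G"]

# '^' anchors the pattern to the start, so only strings beginning with G match.
starts_with_g = [s for s in samples if re.search(r"^G.*", s)]

# '$' anchors the pattern to the end, so only strings ending in 'e' match.
ends_with_e = [s for s in samples if re.search(r"e$", s)]

# With re.MULTILINE, '^' and '$' match at the start and end of every line,
# not just at the start and end of the whole string.
text = "Go\nlang\nGopher"
g_lines = re.findall(r"^G\w*$", text, flags=re.MULTILINE)

print(starts_with_g)
print(ends_with_e)
print(g_lines)
```

Note that in Python the two anchors apply to the whole string by default, and the per-line behavior described in this article requires the `re.MULTILINE` flag. Other tools, such as line-oriented editors, work one line at a time.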

These two characters first appeared as line anchors in Ken Thompson's port of the QED text editor. The original QED editor did not have regular expressions and did not use '^'. It did use '$', but to refer to the last line in the buffer. Mr. Thompson adapted that meaning of '$' and carried it over into regular expressions.

Mr. Wayne then tried to find out why the QED editor chose '$' to mean the end of the buffer. According to the QED paper, the editor was developed for the SDS-930 mainframe, and according to Wikipedia, the SDS-930 used the Teletype Model 35 as an input device.

Mr. Wayne obtained a photo of the Teletype Model 35 from a sales brochure. A close look at the photo shows that the keyboard has none of the symbols '[]{}\|^_@~'. Of the symbols that are present, '$' seems to be the least useful.



'$' is an important symbol for business, so every typewriter was expected to have it, but in programming it had no meaning other than 'dollar'. Mr. Wayne prefaces his conclusion by saying, 'I don't have a clear answer and am not satisfied,' and speculates that this is why QED adopted the '$' symbol.

On the other hand, '^', which indicates the beginning of a line, was not on the Teletype Model 35 and therefore was not used in the QED editor. Since '^' did exist on Mr. Thompson's keyboard, Mr. Wayne speculates that Mr. Thompson adopted it as the symbol for the beginning of a line.


in Software, Posted by log1d_ts