Regularly work smarter, not harder

I was working on my project the other day when I ran into a bit of a pickle. I was in the process of balancing my programming karma: an expression I use for countering laziness and bad habits while writing programs by doing all the nice and responsible things, such as proper whitespacing and writing concise, descriptive comments. The task at hand was merging a few Python scripts into a module file, so that I could neatly call useful, already-written functions from other scripts and save future me a lot of guesswork.

After pasting 150 lines from a vital script dealing with data input into the module file, the aforementioned vegetable dawned on me. If you didn’t already know, Python is a programming language that uses whitespace indentation for delimiting blocks instead of alternatives such as keywords (end, enddo) or curly braces (the { } symbols). This means that if I were to define a function, which was exactly what I was trying to do, every line of its body would need a number of whitespace characters – spaces or tabs – in front of it. My usual convention is using either 4 spaces or a tab with the same shiftwidth. The problem I was facing was that none of the one hundred and fifty lines had any whitespace in front of them after the pasting, so they were all collectively hugging the left side of my text editor! Due to some moodiness on ParaView’s behalf, or perhaps the position of Saturn in relation to Mars, I recall that I absolutely needed 4 spaces in front of each one of those lines.

[Image: “I don’t like calculators”]

It was apparent to me that I was going to either have a really, really long afternoon, hire a secretary, or find a way to work smarter. Luckily, I was already familiar with the concept of regular expressions, or regex for short.

[Image: “Don’t try this at home”]

A regular expression is a sequence of characters – numbers, letters, even symbols – that is interpreted as a search pattern. Such a pattern is mainly used for matching strings. A simple example of this kind of operation is the “find and replace” tool found in almost all text editing software, which almost everyone and their cat has used to fix something like the spelling of ‘Nietzsche’ in that 10-page essay on philosophy back in high school. However, simply replacing words with other words was not going to be much help to me, so I had to get a bit craftier: what I needed was a more powerful regular expression interpreter. Numerous text editors, word processors and search engines support regular expressions in some form or another, and many programming languages such as Perl, Python, Java and C++ have either built-in or standard-library regex functionality. Since I almost exclusively use Vim out of personal preference (except when writing essays on Nietzsche), I already had access to an interpreter with a sed-like regex syntax. The previous picture is a simple example of such syntax. Using the re library in Python, for example, would just have been needlessly over-complicating things (not least because of the Asimov-esque fact that I would be writing a Python program that would write me a Python program).
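To make the difference between a literal find and replace and a search pattern a little more concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than something taken from an actual essay: the misspellings, the deliberately loose pattern and the function name fix_spelling are all made up for the example.

```python
import re

# Hypothetical pattern: "Nie", then any run of word characters, then "che".
# It is loose on purpose, so one pattern catches several misspellings.
MISSPELLED = re.compile(r"\bNie\w*che\b")


def fix_spelling(text: str) -> str:
    """Replace every Nietzsche-like word with the correct spelling."""
    return MISSPELLED.sub("Nietzsche", text)


essay = "Nietsche wrote it, Nietzche meant it, Niezsche regretted it."
print(fix_spelling(essay))
```

A plain find and replace would need one pass per misspelling; the pattern handles all three in a single pass and leaves a correctly spelled “Nietzsche” unchanged.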

At this point I must mention that I was not a complete stranger to regular expressions at that time and that the whole process of adding 600 spaces took less than 5 seconds in total.

What I did wasn’t quite wizardry though: I just selected the 150 lines of code that were bothering me and wrote the regular expression:

s/^/    /

The expression can be broken down into 3 parts. The first one is the overall structure “s///”, where the s stands for “substitute” and the whole thing reads as “substitute/this pattern/with this replacement/”.

The second important part is the ^ symbol, which is the non-obvious part of the expression. In this context the symbol is called a metacharacter, and by convention it matches the beginning of the string or, when working with lines, simply the beginning of the line. Since I had selected a great number of lines of code, it matched the beginning of each of those lines. Great – that’s exactly where I wanted to add spaces.

The third part of the expression is what lies between the last two “/” symbols – as you may (or may not) see, that’s where I have written 4 spaces. And the regular expression as a whole does exactly what you might expect it to do: it adds 4×150 spaces to my Python script, 4 to the beginning of each line, neatly indenting it to where I want it to be!
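For comparison, the same substitution can be sketched with Python’s re module. This is only an illustration under stated assumptions: the function name indent_lines and the sample input are hypothetical, and in the editor the one-line command above is all that is needed.

```python
import re

INDENT = " " * 4  # the four spaces between the last two slashes


def indent_lines(text: str) -> str:
    """Mimic s/^/    / applied to every line of the given text."""
    lines = text.splitlines(keepends=True)
    # Without re.MULTILINE, ^ matches only at the very start of each string,
    # so every line receives exactly one substitution, as in the editor.
    return "".join(re.sub(r"^", INDENT, line) for line in lines)


pasted = "x = 1\ny = 2\n"
print(indent_lines(pasted), end="")
```

Splitting the text into lines first mirrors how the editor applies the command to each selected line in turn. Empty lines get the four spaces as well, since ^ matches at the start of an empty line too.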

This was just a small example of how a powerful tool such as regular expressions can make one’s life much easier. Assembling a list of useful features and examples would take a ridiculous amount of space, so I will just draw attention to the fact that there are countless tutorials and pieces of online documentation available, such as this great website.

[Image: “Professional wizardry is tough during times of easily accessible online documentation”]

To close off, here is a small riddle similar to an expression I used just a couple of days ago: what does the following (surprisingly simple) expression do and why?

s/\[\/\]/\[\\\]/

About

Undergraduate physics student at Tartu University.

6 comments on “Regularly work smarter, not harder”
  1. Hannes says:

    Great post! In vim, this expression seems to replace

    [/]

    by

    [\]

    You need the \ in front of the [ to tell the program to look for the actual character “[”, am I right?

    • Jasper says:

      That’s right. The seemingly more confusing part of the expression is the fact that you need to escape the slash as well, since it’s the delimiter of the sed-style editor command (in addition to the metacharacters [ and ]), hence the alien-looking “\/\”-combination in the first set of brackets.

      • Nick Geoghegan says:

        It’s not confusing that you need to escape the slash, since slash is a delimiter.

        So you need to escape the delimiter… with the delimited!

        Now where’s xzibit?

  2. Luna says:

    Great post! I really like your drawings! 🙂

    Let me tell you another way to do it, easier for beginners and more intuitive:

    All vim users know at least how to write (pressing “i” to insert and then “Esc” when they finish). So, instead of selecting all the lines (Shift + v) you can select a block (Ctrl + v):

    when you are in the first position of the line, press “Ctrl+v”, then with the arrow go down until the last line you want to add the spaces; when everything is selected you have to press “Shift+i”; add the spaces (they will appear in the first line that you selected); now press “Esc”… after that the spaces will appear in all the lines magically!

    (If you pressed just “i” and not “Shift+i” just the first line will be changed)

    Congrats again for the comic! It is fantastic!

  3. Karl says:

    Regex gets a lot easier to read if you simply use a different delimiter, for example : to separate your patterns.

