Jan 20, 2016

Comments as defined by Python in 2.7.x and 3.4.x

A comment starts with a hash character (#) that is not part of a string literal, and ends at the end of the physical line. A comment signifies the end of the logical line unless the implicit line joining rules are invoked. Comments are ignored by the syntax; they are not tokens.
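The three parts of that definition can be checked from Python itself. What follows is a minimal sketch, assuming Python 3 and the standard `ast` module; the helper name `same_program` is made up for illustration. It compares parse trees, so it shows what the parser sees, not how the interactive prompt behaves.

```python
import ast


def same_program(a, b):
    """Return True if two source strings parse to the same syntax tree."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


# A trailing comment is ignored by the syntax.
trailing = same_program("x = 1  # a comment\n", "x = 1\n")

# A hash inside a string literal does not start a comment.
in_string = ast.literal_eval("'# not a comment'")

# Inside brackets (implicit line joining) a comment does not end the logical line.
joined = same_program("x = [1,  # first\n     2]\n", "x = [1, 2]\n")

print(trailing, repr(in_string), joined)
```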

Seems pretty straightforward to me. The parser finds a hash character and ignores everything up to the new-line token (except when the implicit line joining rules are invoked). So why was I struck with a “hmmm” when I opened up a Python console, entered a comment, and pressed Enter?

>>> # this is a comment, guess what happens next?

If you look closely, the next line is prefixed with an ellipsis (…) and not a new prompt (>>>). Well there must be a reason, but this feels “unexpected.” So I pull up a 2.7 console and try it again.

>>> # this is a comment, guess what happens next?

So it looks to be intentional, although it doesn’t seem to feel correct. Ok, let’s give pypy a try and see what happens there.

>>>> # this is a comment, guess what happens next?

Ok, now I am confused. CPython treats it as an unclosed statement of some kind, although you would think it would be closed because a new-line token should have been encountered thus closing off the comment. However, when pypy gave a different and less surprising result, I put a hand to my chin and said, “Hmmm.”

Ok interweb – does anyone know what is going on here?

1) Why the unexpected ellipsis in CPython?
2) Why does pypy not return an ellipsis?
3) Are they both correct and are just slightly different implementations of the same reference? Or is one more correct than the other?
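One way to poke at question 3 from plain Python is the standard library's `codeop` module, which decides whether a chunk of interactive input is complete and which `code.InteractiveConsole` is built on. This is a minimal sketch, assuming a CPython 3 release where `codeop` treats comment-only input as a complete (empty) command. The helper name `needs_more_input` is made up for illustration. It exercises a different code path from the C-level prompt shown above, so it only shows that a different answer is possible, not why the built-in prompt answers the way it does.

```python
import codeop


def needs_more_input(source):
    """Return True if codeop considers the source an incomplete command.

    codeop.compile_command returns None for incomplete input and a
    code object for a complete command.
    """
    return codeop.compile_command(source) is None


for line in ["# this is a comment", "if True:", "2 + 2"]:
    print(repr(line), "-> continuation prompt?", needs_more_input(line))
```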

While I’m scratching my head, you can ponder this:

>>> # comment
... 2 + 2
 Posted by at 3:14 am

  6 Responses to “The case of the curious comment”

  1. Strange. I tried the same in ipython and the behaviour is like pypy.

  2. But:

    >>> [Enter]


    >>> [Space] [Enter]

    (CPython 2.7.11)

    I suspect CPython simply expects any non-empty input line to contain a statement of some sort, so if it can’t find one, it goes into continuation mode. And in contrast, pypy doesn’t have this requirement, so it just returns to the prompt.

    I don’t think the behaviour of the interactive interpreter is specified anywhere to this level of detail, so I wouldn’t say one is more “correct”. For what it’s worth, I find pypy’s behaviour slightly more sensible here.

    • I agree with both of your statements.

      I am developing a workshop for people who are new to or interested in programming, and have been putting a more detailed eye on these things as I develop the course. At the start we do quite a bit in the Python console, and I was working out how to demonstrate a comment (not something I would normally do at an interpreter prompt) when I encountered the behavior.

  3. Interesting find! It doesn’t happen under IPython version 4.0.0 (on top of python 3.6.0a0).

  4. My wildass guess is that the terminating endline is getting stripped before it is fed to the parser so it doesn’t detect the end of statement.

    • Toyed with it a little bit. I traced it to some logic in int tok_get(...) in Parser/tokenizer.c in the CPython 2.7 branch. Here’s what the patch looks like:

      --- a/Parser/tokenizer.c
      +++ b/Parser/tokenizer.c
      @@ -1249,7 +1249,7 @@ tok_get(register struct tok_state *tok, char **p_start, char **p_end)
                      not passed to the parser as NEWLINE tokens,
                      except *totally* empty lines in interactive
                      mode, which signal the end of a command group. */
      -            if (col == 0 && c == '\n' && tok->prompt != NULL)
      +            if (col == 0 && tok->prompt != NULL)
                       blankline = 0; /* Let it through */
                   else
                       blankline = 1; /* Ignore completely */

      The idea seems to be to drop lines that start with a newline or a hash, but for some reason the code only checked for the blank-line case? It’s pretty messy in there, so I’m not confident this hasn’t broken anything else, but it is definitely this part of the code 😉
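The effect of that one-line change can be modeled in a few lines of Python. This is only a sketch of the condition in the diff above, not CPython's tokenizer. The function names are made up, `interactive` stands in for `tok->prompt != NULL`, and it assumes the check is only reached when the first non-whitespace character `c` on a line is a hash or a newline.

```python
def blankline_original(col, c, interactive):
    """Model of the original check: 0 lets the line through, 1 ignores it."""
    if col == 0 and c == "\n" and interactive:
        return 0  # totally empty line at the prompt: let it through
    return 1      # whitespace-only or comment-only line: ignore completely


def blankline_patched(col, c, interactive):
    """Model of the patched check: c is no longer consulted."""
    if col == 0 and interactive:
        return 0  # anything starting in column 0 at the prompt: let it through
    return 1


# (column, first character, interactive?) for a few interesting inputs.
cases = {
    "empty line": (0, "\n", True),
    "comment in column 0": (0, "#", True),
    "space then Enter": (1, "\n", True),
    "comment in a file": (0, "#", False),
}
for name, args in cases.items():
    print(name, blankline_original(*args), blankline_patched(*args))
```

Under this model the two versions differ only for a comment starting in column 0 at the interactive prompt. A lone space followed by Enter is ignored by both, which would be consistent with the [Space] [Enter] observation in response 2.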
