This is Taylor R. Campbell's blag. -*- outline -*- Copyright (c) 2006--2013, Taylor R. Campbell. Verbatim copying and distribution of this entire article are permitted worldwide, without royalty, in any medium, provided this notice, and the copyright notice, are preserved. 2013-03-21 Fun with C (or, @!*#%!#^&!^#$^@!%#) Consider the following fragment of C code with the type T defined to be an integer type: uint64_t x = UINT64_C(0x0123456789abcdef); T n = 64; printf("%016"PRIx64"\n", (x & ~(n - 1))); Say we're on amd64, with sizeof(char) = 1, sizeof(short) = 2, sizeof(int) = 4, and sizeof(long) = 8, and two's-complement arithmetic -- nothing exotic or pathological here. Without running the code, can you guess what it prints for each of the following definitions of the the type T? typedef uint8_t T; typedef int8_t T; typedef uint16_t T; typedef int16_t T; typedef uint32_t T; typedef int32_t T; typedef uint64_t T; typedef int64_t T; Hint: It's not 0123456789abcdc0 in every case. 2012-10-31 Moving mathematics forward I have made a tremendous breakthrough tonight in the greatest open problem of all mathematics, the Riemann hypothesis, by finding an example of the elusive Riemann zeta pumpkin. Based on my extensive research into the literature on the subject[*], I believe I am the first to have come across such a beast. Unfortunately, I don't know how to talk to the camera with which I catalogued my evidence of this critter, so proof by photography will have to wait until I figure out how to talk to a USB PTP or MTP device in NetBSD. [*] 30 seconds of browsing Google image search results. [Update, 2012-11-02: I tried gphoto2 and libmtp, but neither worked. gphoto2 can identify the device (`gphoto2 -a'), but everything else fails with a `bad parameter' error or something. libmtp can't even seem to identify that there is a device. ktrace just shows that ioctl(USB_DO_REQUEST) fails with EIO or something; I think I need to dig into the USB layer to find out what's wrong. I'll add more details when I have physical access to the device again.] [Update, 2012-12-02: Here's a photograph: http://mumble.net/~campbell/images/riemann-zeta-pumpkin.jpg I'm afraid I cheated, though.] 2012-10-17 What's an epigraph? An anonymous interlocutor asked me to define the word `epigraph' some months ago. An epigraph's a witty pith, or perhaps a pithy wit, That you could start a chapter with to prime the reader's mind a bit. 2012-10-04 My verse may be worse but your prose hurts my nose. (dribblings from a late night when bad poetry was coming out like a fit of sneezes) 2012-08-24 What's wrong with Python I just found this Python code in the wild: f = open(pathname) try: x = foo(f) finally: f.close() process(x) This is wrong at so many levels I don't know what to say. And it wouldn't be fair to blame whoever was responsible for putting the code in this state, because this is very nearly the right way to write it in Python, and Python will happily accept this in spite of its insensibility. 2012-05-23 Energy and computation Critics have noted that the premise in _The Matrix_ of using the heat of human metabolism as an energy source, the energy premise, is hopelessly contrived and detracts from the film. And, indeed, it is hopelessly contrived, but this doesn't bother me because I have a much better, more plausible premise to replace it -- and a reason why they would never have used the better premise in the film. Instead of the energy premise, I take a computational premise: what the machines exploit the humans for is not energy but computation -- human brains are very good at learning quickly and optimizing approximate solutions to NP-hard problems. That way, the machines don't need to reverse-engineer the human brain or develop (`learn') approximation algorithms of their own. The computational premise, though, doesn't lend itself to a cute image of Morpheus holding up a D-cell and saying `...turning humans into this.' The energy premise is easy to explain, if contrived; whereas it would be a hopeless cinematic blunder to attempt to explain the computational premise to a general audience in Morpheus's brief monologue to Neo. Thus, I am satisfied with a better premise for the film, and forgive the creators for not using it. 2012-05-21 2012-05-17 Metre matters more to me than rhyme. Teeter tatters tore to tea, then thyme. ...wait, does that sound rhyte? 2012-05-13T07:34:55Z Is a poem like a joke in that if the meaning must be explained, then the value is lost? 2012-04-24 Mail Mail is hosed. Unreliable queues, mail stores without integrity checks or sensible indexing, broken flags, hopelessly confused MUA designs, and more... Argh! 2012-03-15T13:25:46Z Random `Fix random kernel memory corruption by algo_doublehash(). And by "random" I don't mean just "arbitrary" as in using an uninitialized pointer, but random as in corrupting the contents of memory addresses chosen using a crypto-strength random number generator.' 2012-03-15T06:23:47Z Beware the IDEs of March Beware the IDEs of March, my son -- The tools that bite, the builds that botch. Beware Eclipse, Emacs, and shun The kludgious Binfordsnotch! ...what? 2012-01-22 Laptop I have been asked a few times what laptop I chose to buy last year. The answer is that I didn't. The laptops I was most strongly considering were the Thinkpad W510 and the Sony Vaio Z. But I got fed up with shopping; instead I got a new disk for an old MacBook1,1 that I had lying around, and fixed some bugs in NetBSD to make it run on that. 2011-12-19 The FM Principle It often amazes me to marvel at just how much of computing is founded upon the FM principle. Woodpeckers. 2011-10-29 Antediluvian barmy curmudgeons! Desolate ergatives fuming their grudgeons! Heckling isthmus of jocular kerosene! Lepidopteric morose nitrotoluene! Peripatetic Orphean qwertigial Runcible syrup of toad urns vestigial! [If you comment on this article, please be sure to mention bananas.] Winkle-picked xanthines, ytterbium seder of Zoölogues, zits, Zapatistas, &c.! 2011-08-{29,30} paredit patches There's a pile of patches potentially for paredit 23 at . Try them out and let me know what they break, or, in the unlikely event that they don't break your whole editing feel, whether you like them, and what cases you especially like in them. That way, I can write down the cases so they won't be broken again in the future. 2011-08-29 Crypto library I am thinking of writing a crypto library to complement djb's well-known one by providing some bits he omitted. I will call it 1-[5-(1,3-benzodioxol-5-yl)-1-oxo-2,4-pentadienyl]piperidine. 2011-05-09 FOCUS THEFT SHOULD BE A FELONY !@*#^$@!&*#^@%!*#@$^%!#&*@!% 2011-03-32 New job, project status I have accepted a position at Microsoft in the finance department, working on in-house tools for optimizing Microsoft's tax structure. I am moving to Redmond; I fly out tonight. Anyone in the Seattle area know any excellent seafood restaurants? My new manager has asked me to port paredit to Visual Studio so that we can use it in-house, and since I have only limited time for personal projects, I shall be dropping support for paredit.el in GNU Emacs (and paredit.scm in Edwin). Today I completed and committed a long-running project to switch MIT Scheme to use dynamic scope by default, in preparation for the upcoming Scheme standard produced by the working groups; and I also committed my (incomplete) patches to change Edwin's default key bindings to match modern vi conventions. 2011-02-24 Paredit on old Emacs Does anyone still use paredit on XEmacs, or on GNU Emacs 21? 2011-01-17 Laptop I am looking to buy a new laptop. But I am picky. At I wrote down a lot of constraints. Can anyone help me to satisfy them? 2011-01-14 PGP I'm pretty frustrated with GnuPG. Straightforward operations (decrypt a message -- no, really, just *decrypt* it) are not straightforward to perform with the gpg utility. There are half a bazillion knobs controlling gpg's behaviour so that straightforward command lines do not have straightforward semantics. And half the operations can be done only interactively -- by which I mean not just that it prints prompts, but that it prints them to /dev/tty and reads input from /dev/tty. It's not obvious what the gpg utility is actually doing whenever I use it, especially when it prompts me for a pass phrase to decrypt one of my keys to make a signature without telling me what it's signing. Just about everything keeps state in several different files in one's GnuPG home directory; it's not clear what operations are actually stateless. There is no good introduction to the meaning and use of the trust model, or any coherent or interactive picture for it. The output of `gpg --list-keys', `gpg --list-sigs', `gpg --list-packets', and so on, is confusing on multiple levels. Here are three things I'd like to see in an OpenPGP implementation: 1. A collection of utilities to perform straightforward operations straightforwardly. Examples: a utility to generate an RSA key pair; a utility to decrypt a packet given the Elgamal private key; a utility to make each particular type of signature; &c. 2. A packet inspector. This could be graphical (maybe Gtk+), or it could be embedded into Emacs. It would replace `gpg --list-packets', and allow the user to interactively explore the packets, see all the information stored therein, decrypt messages, verify signatures, and so on. Think Wireshark for OpenPGP. 3. An interactive truth maintenance system. Fundamentally, a PKI is a system for distributing and interpreting signed assertions, such as `J. Random Luser is the owner of key 1234abcd', signed with key 1234abcd. An OpenPGP key ring is a pile of public keys and signed assertions. The cryptographic signatures make the assertions as plausible as they would be if made in person, but we need a computer to handle the crypto and draw plausible conclusions from the premises we specify. But we may find some of the conclusions suspect; in that case, we want to ask why the computer drew those conclusions, and what premises we must reject if we want to reject the conclusions. Or we may decide that we're not happy with some of the premises, and we'd like to see what conclusions lie on shaky grounds -- we'd like to see what conclusions we give up if we reject the premises. These are exactly the functions of an interactive truth maintenance system. The GnuPG trust model is essentially a set of inference rules, which could be used with such a truth maintenance system. But gpg provides no tool to explore the `web of trust' that its trust model induces; it just has an implementation of the engine to infer conclusions given some premises. Perhaps some day I'll actually write some of these, or find that somebody already has. P.S. It's hard to google anything about PGP. Google turns up `----BEGIN PGP SIGNED MESSAGE-----' a lot more than it turns up useful results. P.P.S. I have glanced at netpgp, a relatively new implementation of OpenPGP based on openpgpsdk that will probably be part of NetBSD 6.0. Its behaviour does not appear substantially different from that of GnuPG, though, other than that netpgp is less polished so far. 2010-11-04 Gtk+ In a Gtk+ program I use regularly, while entering text, I often accidentally hit C-w out of habit from Unix to kill the preceding word. Instead of killing the word, this closes the window, to my interminable frustration. It's not as bad as C-M-DEL zapping X when I want to kill the preceding S-expression in Emacs, but it is frustrating enough that I sought to fix this. It turns out that if you put `gtk-can-change-accels = 1' in your ~/.gtkrc-2.0, hover over the menu item to which the key is bound, and hit DEL, then the key binding goes away. Is this documented? Yes! Well, no. Well, sorta kinda, maybe? Actually, not really. When I googled `"gtk-can-change-accels"', the first hit I got was from gtk.org, and appears to be the only actual documentation relevant to the subject. All the other hits were from mailing list archives, forum posts, blug posts, and FAQs from various applications, saying that one can use the option to do such and such. So what does this actual documentation say? Well, first of all, it's API documentation, not remotely fit for user consumption. And here's all that it has to say about the option: `Whether menu accelerators can be changed by pressing a key over the menu item.' Does it mention hitting DEL? No. Does it say how to quote DEL, in order to bind it to an action? No, of course not! Does it mention ~/.gtkrc-2.0? No. In fact...is there any documentation at all for ~/.gtkrc-2.0? No! Not even API documentation! You just have to know about it! Googling it just turned up more mailing list archives, forum posts, blorg posts, and FAQs with tips about how to do such and such by frotzing your ~/.gtkrc-2.0 file... 2010-10-09 Paredit 22 Paredit 22 is released: Not all the patches got incorporated into it. Patches potentially for paredit 23 are in . 2010-09-20 Paredit 22 beta I have merged the aforementioned patches into a new paredit 22 beta, at as usual; lists the changes as the output of `darcs changes' on the Darcs repository at . Nobody has opined about a paredit mailing list or offered to administer one for me, so I presume there's no interest. 2010-01-01 Paredit patches, still Not enough users have given me feedback on the patches to paredit in . Thus, paredit 22 has not yet been released. See the (chronologically) previous (but textually next) entry. [Update: Since the time of writing, the aforementioned entry has self-destructed.] Should there be a mailing list for paredit? Contact me if you have an opinion on this question, or would like to offer to administer such a mailing list. 2009-03-28 More simply addressing the confusion about UNWIND-PROTECT in Scheme Suppose you want to release a resource after a part of a program that uses the resource. For example, consider a connection to a database that should be disconnected, a file descriptor that should be closed, or a buffer for sensitive data such as a pass phrase that should be destroyed by zeros or random data. Once the resource is released, it can be neither reused nor readily reconstituted. Is the resource represented in Scheme by a single object? If so, ensure that with the object is registered a finalizer that releases the resource, so that even in the face of an interactive ^C at the REPL the resource will be released. All of the above examples qualify. (define (open-database-connection database-descriptor) (let ((handle (allocate-database-handle))) (register-finalizer handle %close-database-connection) ;; After this point, if we are interrupted -- say with ^C at ;; the REPL --, somewhere the handle will be recorded so that ;; its resources can be released even if the usual logic to ;; close the connection is never reached. (%open-database-connection database-descriptor handle) handle)) Is it /necessary/ for the resource to be released immediately after the part of the program that uses it? If so, use REWIND-PROTECT at . In this case, control can never re-enter the extent that uses the resource. Of the above examples, only a buffer for sensitive data qualifies. (define (call-with-pass-phrase prompt receiver) (let ((buffer (make-pass-phrase-buffer))) (rewind-protect (lambda () (prompt-for-pass-phrase prompt buffer) (receiver buffer)) (lambda () (clear-pass-phrase-buffer buffer))))) Is it /desirable/ for the resource to be released immediately after the part of the program that uses it? If so, use UNWIND-PROTECT at . Of the above examples, both the database connection and file descriptor qualify. (define (call-with-database-connection database-descriptor procedure) (let ((handle (open-database-connection database-descriptor))) (unwind-protect (lambda () (procedure handle)) (lambda () ;; If a continuation is captured inside PROCEDURE so that ;; this cannot execute immediately when PROCEDURE returns, ;; then closing the database connection here is redundant, ;; since registered with HANDLE is a finalizer that will ;; close the connection. However, provided that the ;; finalizer have no effect if the connection is already ;; closed, this redundancy is harmless and enables ;; UNWIND-PROTECT to close the connection immediately if ;; no such continuation has been captured. (close-database-connection handle))))) Rationale: Scheme's CWCC (or composable continuations) enables powerful control abstractions such as inverting control. For example, (FOR-EACH ) applies to each element of in sequence. With CWCC, one can reify the iteration of the list by FOR-EACH into a first-class cursor object that can be advanced independently. This way, for example, we can traverse multiple lists in parallel using only the one-list FOR-EACH procedure. If we rather consider a procedure (FOR-EACH-QUERY-RESULT ), the situation is not much different, except that FOR-EACH-QUERY-RESULT may entail the opening of a connection to a database, and the connection is a resource which should be released after the query results are finished being received. Using CWCC to invert control of FOR-EACH-QUERY-RESULT is just as valid as using CWCC to invert control of FOR-EACH, but if FOR-EACH-QUERY-RESULT uses REWIND-PROTECT (a common idiom in unfortunately much Scheme code), it inhibits control inversion. UNWIND-PROTECT allows the control inversion. Why involve finalization in UNWIND-PROTECT, rather than just associating a finalizer with the object representing the connection to the database? The difference is in the intent implied by the use of UNWIND-PROTECT that if control cannot re-enter the protected extent, the protector can be immediately applied when control exits the protected extent extent even if no garbage collection is about to run finalizers. Note that one cannot in general reconcile (1) needing to release the resource immediately after control exits the extent that uses it, and (2) enjoying the benefits of powerful control abstractions such as inversion of control. However, if (1) is relaxed to (1') needing to release the resource immediately after control /returns normally from/ the extent that uses it, then one can reconcile (1') and (2) by writing (CALL-WITH-VALUES (LAMBDA () ) (LAMBDA RESULTS (APPLY VALUES RESULTS))), or simply (BEGIN0 ) using the common name BEGIN0 for this idiom. (In Common Lisp, this is called PROG1.) Also note that condition systems are not involved; they are a red herring. Any condition handler that releases a resource should do so only when the condition implies that the resource is no longer usable; for example, if a database connection becomes desynchronized somehow and cannot proceed, then it is reasonable for the handler for this condition to close the connection and destroy any of its state. However, when handling conditions that do not imply that the resource is no longer usable, the resource should not be released. An enclosing handler may continue within a protected extent, or even exit and re-enter the protected extent, for instance while iterating over a reified database cursor procured by control inversion. Handling any condition by releasing the resource defeats these useful operations. At is an implementation of UNWIND-PROTECT for MIT Scheme. This definition is lighter-weight than the portable one. However, it does not attempt to run the protector immediately if it can. Appendix A: John Cowan has asked for elaboration on the inversion of control for iteration with FOR-EACH. (define (list->stream list) (iteration-procedure->stream (lambda (receiver) (for-each receiver list)))) (define (iteration-procedure->stream iteration-procedure) (lazy (call-with-current-continuation (lambda (return-stream) (iteration-procedure (lambda (element) (call-with-current-continuation (lambda (return-to-iteration) (return-stream (stream-cons element (lazy (call-with-current-continuation (lambda (return-cdr) (set! return-stream return-cdr) (return-to-iteration)))))))))) (return-stream stream-nil))))) ;;; Simpler definition with reset & shift. (define (iteration-procedure->stream iteration-procedure) (lazy (reset (begin (iteration-procedure (lambda (element) (shift continue-iteration (stream-cons element (lazy (continue-iteration)))))) stream-nil)))) Appendix B, discussion question: Should UNWIND-PROTECT evaluate protection forms in the dynamic context of the call to UNWIND-PROTECT? This rules out using UNWIND-PROTECT within REWIND-PROTECT, which may be an onerous requirement. 2009-03-21 On file systems, Unix, and brain damage [updated] Earlier this month, users of the alpha version of Ubuntu, the first to support the ext4 file system by default, reported an immense increase in data loss after crashes leading to corruption of their desktop environments. After a crash, many files that were recently written were truncated on disk to zero bytes.[1] How did this happen? To atomically replace a file's old data with new data, applications write the new data to a file linked at a temporary pathname, and then rename the link to the permanent pathname, overwriting the link to the file with the old data. This sequence of operations, with the atomicity guarantee of the POSIX rename() function, guarantees that at any time, at the permanent pathname will be a link either to a file containing the old data or to the file containing the new data. This guarantee is made, at least, during normal system operation. POSIX guarantees nothing when the system crashes or the power fails. If the system is even bootable after such an event, POSIX permits it to have replaced the contents of the file system with a collection of cat macros. But while this behaviour is permitted, it is not useful. System crashes and power failures are facts of life, and a well- engineered file system should limit, not exacerbate, their damage to user data. In the name of performance, the ext4 file system as it was bundled with the alpha version of Ubuntu would delay the allocation of blocks on the disk for a file so that they can be allocated contiguously together when needed. Periodically the system forces allocation of blocks and commits them to the disk; by default, this happens on sixty-second intervals. But the ext4 file system would sometimes commit renames to the journal before allocating blocks for the new data, during a sixty-second window. If the system crashes, and the journal is replayed, then the rename may be applied -- before the new data are written to the disk. This effectively reorders the sequence that many applications rely on to atomically replace files, destroying its atomicity. The developers of the ext4 file system answered with several `solutions': [2], [3] (1) make your systems more reliable with UPSs and by avoiding buggy proprietary drivers that cause such frequent crashes, which the developers of the ext4 file system don't encounter; (2) use SQLite and other databases instead of writing files to disk; (3) change all the applications that do write files to disk so that they fsync before closing and renaming; and if that becomes a performance problem then (4) run fsync periodically in a separate thread. Solution (1) is no excuse for a reliable file system, and is a dangerous sentiment to be held by the developer of a file system likely to find widespread use in personal computers running Ubuntu, computers which may not have the resources available in data centres for reliability. This suggests that the developers of the ext4 file system have never seriously stress-tested it with forced power cycles during heavy I/O loads! About solution (2), and the general feeling of `we like databases!', I can only refer to jwz's rant about mbox summary files. Technical merit aside, solutions (2) and (3) are impractical because there are simply too many programs to change. Solutions (3) and (4) are disturbingly misguided. Even if it were practical to change every application's write/rename sequence into a write/fsync/rename sequence, to do so would be semantically wrong. Only programs that require a file's data to be written to permanent storage before proceeding should use fsync; this is what fsync does. For example, editors such as Emacs fsync files before informing users that the files are written to disk; users are unhappy if the files that their editors told them were written to disk were not. Mail receiving agents fsync files where they store mail before telling the sending agents that the mail has been accepted; users are unhappy if they are told that mail was delivered when it was not. Programs that require only that the replacement operation be atomic, and that it not leave a file in a corrupted intermediate state, should not use fsync, because the purpose of fsync is to provide a guarantee which these programs do not need. For example, the KDE window configuration is not critical user data, but may be stored on the file system. It is OK for the machine to forget that a user has dragged a window a few pixels, but it is not OK for KDE to read a corrupted intermediate state of the file between the old and new versions -- hence atomic write/rename replacement without fsync. Extraneous fsyncs, in fact, degrade performance, especially for small files that are frequently overwritten -- it gives the file system no opportunity to keep the intermediate versions solely in memory. Calling fsync periodically in a separate thread is nonsense -- if a program cannot proceed until a file's data are on permanent storage, it should call fsync and wait; if a program can proceed, then it shouldn't call fsync. The operating system already periodically synchronizes the file system with permanent storage -- at least, one would hope so; if it doesn't do this, then it has a bug. Actually, fsync does not guarantee that a file's data are written to permanent storage before proceeding. POSIX (or at least the SUSv3, ) claims that it shall: The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes. The nature of the transfer is implementation-defined. The fsync() function shall not return until the system has completed that action or until an error is detected. Many man pages echo this sentiment. For example, NetBSD's fsync(2) man page begins with: fsync() causes all modified data and attributes of fd to be moved to a permanent storage device. But on AIX, Darwin, Linux, NetBSD, and probably other Unices as well, fsync does not actually guarantee that the file's data are written to permanent storage. Instead it only nudges the file system to force some blocks associated with the file slightly sooner. If the file system is backed by a hard disk device that has a write cache, the blocks might just end up in the write cache -- which is volatile, and which may evaporate in the case of a power failure. Common protocols by which operating systems communicates with storage devices have commands to request that the storage devices force their write caches to permanent physical media. Some devices ignore these commands, in which case one cannot win, but the commands can be sent nevertheless, and this is the best effort that the operating system can make. And AIX, Darwin, Linux, and NetBSD all provide ways for programs to request this -- different ways, that is. AIX and NetBSD provide a system call fsync_range, which has a flag FDISKSYNC (NetBSD) or O_DSYNC (AIX) that requests any blocks associated with the file to be forced to permanent physical media. Linux provides a fancier but similar system call sync_file_range. Darwin provides an fcntl command F_FULLFSYNC. These approaches are all slightly different, and for file descriptors not backed by permanent storage devices they all report different errors. So just to force a file's data to disk reasonably reliably on at least four different Unices and to ignore the cases of file descriptors not backed by permanent storage, it takes a good seventy lines of code littered with cpp conditionals, together with autoconf or autoconf-like detection of the relevant system calls. Ugh! Fortunately for the hapless users of Ubuntu, the developers of the ext4 file system have patched it so that renaming a link to a file that overwrites a link to another file forces any pending data for the first file to be sent to the disk before the directory structure is updated to reflect the rename. This may not guarantee that the data are actually, really, truly on permanent physical media, but it's better than waiting sixty seconds before they are even delivered to the device that will eventually write them to permanent physical media. But this kerfuffle has not inspired confidence in me about the ext4 file system. In [3], the developers of the ext4 file system make such absurd suggestions as making a laptop mode in which the kernel ignores fsync, so that even those applications that use fsync semantically correctly will not have the very guarantee that is its sole reason for existence! When did performance become a concern overriding robustness in file systems? [1] http://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781 [2] http://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45 [3] http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ [Update, 2009-03-26: I have been pointed to Linus Torvalds' ideas on the subject, archived at . I hope that the ext4 developers take heed of his sentiment.] [Update, 2012-01-22: All file systems are hosed. Some are just more obviously hosed than others. Argh.] 2009-03-21 Correction to the optimization of resource release in control brackets Due to an unstoppable deluge of the concern of readers after the 2009-02-04 correction, I am obliged to correct the optimization in the original 2006-05-01 entry. The correct optimization is to associate a flag with each wind point which is by default clear and which will be set by CWCC or by any procedures that reify composable continuations. (This is not an onerous requirement: invocation of the reified continuations must run in time linearly proportional to the depth of the wind point tree anyway.) If the flag is clear when the after thunk is called, then the finalizer can be run immediately. 2009-02-04 Correction to control brackets and resource release It has come to my attention that I identified the wrong criterion for when protector thunks established by UNWIND-PROTECT may be called. The criterion I stated is that control may never again return from the protected thunk. The correct criterion is that control may never again re-enter the protected thunk. (Consider, for example, a coroutine that does not terminate and return.) The correct definition for *UNWIND-PROTECT in the appendix is: (define (*unwind-protect thunk protector) (let ((protector-cell (make-protector-cell protector))) (dynamic-wind (lambda () (identity-procedure protector-cell)) thunk (lambda () unspecific)))) In Scheme systems that provide an REGISTER-FINALIZER procedure, which requests the garbage collector to apply a procedure when a given object's storage is reclaimed, one might write: (define (*unwind-protect thunk protector) (let ((object (cons 0 0))) (register-finalizer object protector) (dynamic-wind (lambda () ...refer to OBJECT to prevent reclamation of its storage...) thunk values))) 2009-01-04 I forgot about the 2007-mm-dd entry. I'm not sorry. 2007-mm-dd On macros and expression in Scheme [Placeholder for a long-promised entry which I haven't yet forgotten about.] 2007-08-23 On the Theatrical Security Administration's cleverness During my flight to Seattle one month ago, my case containing a guitar was `randomly' selected to be searched, and the TSA left a gift for me inside: an Orwellian notice about protecting me and my fellow passengers by randomly violating our constitutional rights (which might, they add, involve accidentally damaging the bags, with no liability on the TSA). Flying back to Boston, I left a prominent copy of the Fourth Amendment to the US Constitution in the same case -- specifically, in the body of the case. When I opened up the case this morning, I found no notice from the TSA, but the slip of paper on which was written the Fourth Amendment was tucked into the very top of the case, where the headstock of the guitar goes. The paper does not fit through the neck of the case; it could not have slid there by its own volition, even when tossed by the roughest of baggage handlers. I'll shortly discover whether the TSA's X-ray machine affects ASA 400 film, too. The TSA goon who demanded to put my camera and its film through that machine did let pass an unintentionally helpful hint, though: to avoid damaging any film, just tell the TSA goons that its speed is ASA 1000! 2007-07-16 On debugging in Scheme [updated 2046-03-32] The two Scheme systems that I use primarily these days are MIT Scheme and Scheme48 . Both have decent debuggers that work for most cases. Both have facilities for limiting output to avoid excessive output for deeply nested data structures, or infinite recursion in the printer for circular data structures. Both have rudimentary but working mechanisms for examining pretty much all data in the system. (I can make objects that Scheme48 and MIT Scheme fail to inspect, but most people can't, and this is an uncommon operation.) Let me make this clear right now: these are rudimentary tools. They have some minor niceties (for instance, the usual mode of operation in MIT Scheme's debugger is to have a REPL within every stack frame), but these minor niceties are not especially profound. Rudimentary as these tools are, though, they *work*, and they suffice to examine and debug any reasonable program. And MIT Scheme and Scheme48 are nearly unique in the respect of having general, if rudimentary, debugging tools. This baffles me. This flummoxes me. This flabbergasts me. (No, this is not something new; this is something that I have been aware of for a long time, and privately fumed about.) Today I wanted to debug a program in Gambit. Gambit, too, has a rudimentary debugger. Yet it completely fails in two prominent ways: (1) There is no limitation to the output at the REPL. (The values of local variables on stack frames are printed truncated, but that doesn't help if I want to inspect them more thoroughly.) Although it is possible, with a bizarre incantation about *read* tables (boggle) to protect the *writer* against circular output, there is no apparent provision for limiting the depth of the output, and the default is for the writer and pretty-printer to lose horribly with infinite recursion until you hit C-c. And the default format for printing records involves recursively printing all of the records' components. Graphs? You lose. (2) There is no object inspector by which to interactively inspect the components of objects without having to grovel through megabytes of output from an infinitely recurring writer. No, there is not even a way to inspect the local variables of a closure found on the stack. In fact, I couldn't even find a way to look at the condition that was signalled. And Gambit is better than most other Schemes. Chicken will print a spare trace of call frames that were on the stack. That is all. It does not even show the nesting of the frames; garbage frames that have already returned, but are retained on the stack because of Chicken's architecture, are shown along with everything else. And the stack is traced *after* the error message is printed, so if the error message printed a complex object, what the poor, deprived programmer will see as a trace of the call frames on the stack is procedures used to print the error message -- information that is completely irrelevant to the actual problem. I wrote an object inspector for Chicken (the `dissector'), but it's useless because (1) records no longer contain any introspective information, and (2) it is impossible to get at the objects in call frames on the stack. DrScheme will give a helpful display of a flurry of arrows covering the whole source file in question, so that programmers can see by what flow of control they arrived at the error. Flow of data? What's that? If you plan to create a new implementation of Scheme which you mean to be useful to other humans, *please* implement a general debugger and object inspector. The state of debugging tools in Scheme is absolutely appalling and utterly inexcusible. (I'm told that Chez has a general, if rudimentary, debugger as well; because Chez is commercial and proprietary, however, and the free software Schemes out there meet my needs far better than Chez does anyway, I don't have a licence for Chez with which to experiment with its debugger.) 2007-07-15 On the public domain, licence terms for software, and practical legal choices I am not a lawyer. This is not legal advice. I have not pursued legal studies, and have no background in civil litigation, copyright law, or contract law. (The terminology that I employ in this article will without a doubt induce any lawyer to cringe.) For that matter, I'd rather devote my time to hacking than to burying my head in legal studies for decades: Scheme is a much nicer language for expressing ideas concisely and clearly than English, but unfortunately our laws are expressed in English. It is precisely because I am more interested in hacking than legal analysis, in fact, and because I should prefer never to worry about embroilment in courts of law as a consequence of careless decisions, that I write now. For the past couple of years, I have been releasing my publicly available software into the public domain. I ought to be precise, however: I have been releasing said software with a message claiming that it is in the public domain. I ignored remarks equivocating the notion that any works can be released into the public domain, my reason being that it was simply absurd for an author to be unable to forswear any and all claims of copyright: surely, if an author has no intention of claiming copyright, she ought to be able to relinquish the right of doing so. And the simplest status for works lacking copyright claims is release into the public domain. Not only did I place message stating that the code was in the public domain, but I also added a disclaimer of all warranties -- a rather inept and negligent disclaimer, which read `All warranties are disclaimed.'. My reason for this was that I preferred to avoid cluttering all of my files with long, uppercase, distracting disclaimers of all warranties that some lawyer at Berkeley could think of and enumerate; and that I preferred also to avoid requiring the distribution of a separate LICENCE file with every other file I released, purely for the sake of laziness. What I failed to consider, though, was the actual implications of these messages. Consequently, on realizing this, I briefly considered the following line of reasoning, after encountering another suggestion that works may not clearly be released into the public domain: 1. What matters in the preamble to the software is what terms a court of law will accept and base decisions upon. 2. If a court of law will not accept a statement that the software is released into the public domain, there is no reason for it to prefer the acceptance of a statement of licence terms: one could imagine two copies of a file, one with the terms and one without, and of neither one could anyone claim authority. My second point was faulty, though, because it questioned only witness credibility, which is an unrelated issue; and I omitted a third point, on contract law, warranty, and gifts. I honestly don't know whether a court of law will accept the preamble to a source file, but my guess is that it probably would. The more important question is whether a user (perhaps some corporate entity -- whether good or evil, according to your favourite definition of that spectrum --, or some governmental entity, or anyone) can have confidence in the terms set by the author. There are two aspects of the software in which users may wish to have confidence: (1) the users want to be confident that the copyright and licence terms are fixed, and that the author will never pursue or even threaten litigation for violations of licence; and (2) the users may also want the power to pursue the author for damages if the author's software fails at the cost of the users. Contrariwise, the authors of free software (except perhaps djb) prefer to disclaim themselves of any such warranties. [This entry is not yet finished. Sorry.] Summary: The public domain is murky, even if it may seem to make common sense. I prefer the modified (3-clause) BSD licence for similar terms and greater protection, or at least a clearer idea of the protection for all parties involved. It is also fairly concise. Furthermore, it has been examined by lawyers. [I don't know whether it has been tested in court, and don't have the time or inclination to research this in greater detail. Any references to cases involving the 3-clause BSD licence would be welcome, though.] I repeat, though, that I am not a lawyer, and that this is not legal advice. If the terms of the modified BSD licence are not to your taste, John Cowan has a utility for choosing another one by some simple criteria: . 2007-07-14 Summarizing several issues with compilation and interpretation I ran out of steam writing last month about compilation and interpretation, so here I'll try to summarize as succinctly as I can a few points about which a disturbing number of people are confused. Look to the previous entry for definitions of `interpreter' and `compiler'. 1. Most implementations of Lisps are not pure interpreters; most are based on compilers in some way or another. Call them implementations, not interpreters or compilers: they may *have* interpreters or compilers, but they are not themselves interpreters or compilers. 2. A `compiler' is not a tool chain that turns source code into stand-alone executables. A compiler is a tool that analyzes the meaning of programs and generates programs in another language with equivalent meaning. 3. Your physical machine is an interpreter. The universe is a kind of interpreter as well. 4. No compiled program is `stand-alone'. Not only do compiled programs still require interpreters to execute them, but they also must be structured in the appropriate format for the intended operating system under which they will run, and they generally require support from some run-time libraries. It just so happens that most computers that most people are interested in developing programs for have libc built-in. It also happens that most computers don't already have the run-time libraries needed by Lisp programs. 2007-06-19 Concerning compilation and interpretation, and an interpretation of the compilation of confusion surrounding their meaning and execution I often find myself in an unpleasant proximity to a bickering arglebargle about whether compiled languages or interpreted languages are better; whether Java is compiled or interpreted; whether Sun's Java implementation (or others) is a compiler or an interpreter; whether Lisp can conceivably be compiled, if it has EVAL built-in; how Lisp systems like SBCL can be compilers if they can't spit out files that someone's favourite operating system can execute (which is, in fact, not really true); and other matters of confused frivolity concerning compilation and interpretation. Sometimes I even find a participant or two in the discussion who tries to allay the heat by suggesting that the distinction between compilation and interpretation is murky at best, citing examples like Java implementations that involve both Java to byte-code compilers and byte-code interpreters, or modern x86 implementations, which translate the x86 instruction set every time the i-cache is refilled into an internal more RISC-like format to execute. Sometimes I even do this myself, if I am in a hurry and lack the time to explain the subject more precisely. Let us commence with definitions: /Compilation/ is the analysis of a program's meaning, rendering the same semantics into another language. A /compiler/ is a total function from programs to meanings. /Interpretation/ is the execution of a program given the state of the machine, either diverging or yielding a final answer. An /interpreter/ is a partial function from programs and machine states (or simply machine states, if the program is a part of the machine state) to answers. (Programming language theoreticians in the audience will recognize that compilation corresponds with denotational semantics and that interpretation corresponds with operational semantics. I hope, however, that such members of the audience are already adequately familiar with the concepts that they need read no further. I hope.) A usable language implementation that is based on a compiler also requires an interpreter. When most people hear about `compilers', they are hearing about compilers that map programs in languages like C to machine programs, i.e. programs in languages for the instruction sets implemented by physical processors such as the x86, the PowerPC, the SPARC, the PDP-10, and so on. [This entry is not yet finished.] 2007-05-13 SLIME48 (Apologies: the post prognosticated by the previous preparation has been permanently postponed.) After several long months of version incompatibility, there is a new version of SLIME48 available that will work with CVS SLIME as of 2007-05-13. A tarball is available here: . Bug reports are welcome. Further information is at . 2007-05-09 In preparation for a clarification of macros in Scheme, and their implications on style This blag has remained dormant for a very long time. I must deeply apologize to my avid readers who no doubt were left in a state of prolonged mourning for the stagnancy of their only worthwhile source of meaning in their lives. In a short while, however, no longer will you poor souls remain cold, alone, and bereft of your font of Schemerical wisdom, for I shall soon bring flames upon the now cowering subject of macros in Scheme -- from DEFMACRO to SYNTAX-RULES, and all in between and beyond --, and their philosophical implications on style and expressiveness of programs! For you who cannot stand to wait until the weekend (when I hope to find time for this inflammatory ecriturial extravaganza), I give you a brief taste, the draft of a Lisp style guide whose composition I commenced this past weekend: 2006-09-16 On networking interfaces in Scheme Most Scheme systems have very limited networking interfaces, usually with the notion that BSD sockets are not nice, and that `high-level' verbs like `open a TCP connection to this IP address, on this port' are nice. This is a bad idea, and Scheme systems ought to support wrappers around the whole BSD sockets API. The BSD sockets API is really a fairly extensive generalization of many different flavours of sockets, classifying them according to types with common operations (stream sockets, datagram sockets, &c.), across many different networking domains (IPv4 and IPv6 being the most prevalent today, but of which there are many others, some of which remain useful, like local Unix sockets). But instead of taking advantage of this unification set down by BSD sockets, most Scheme systems *specialize* the interface to provide wrappers for TCP (i.e. internet stream sockets), and sometimes UDP (i.e internet datagram sockets), and sometimes local Unix sockets (with one of two types). In the first place, this is poor practice simply because of the amount of work necessary for the same returns. In essence, the specialized wrappers tear down a layer of abstraction merely by pretending that it isn't there -- while actually building on top of it. The result is that different specialized wrappers must be made for everything intended to be wrapped, and a great deal of code must be duplicated. There is a great deal of C code out there, used as the basis of Scheme networking libraries, that duplicates the same system calls just with different constant arguments (SOCK_STREAM vs SOCK_DATAGRAM, AF_INET vs AF_INET6 vs AF_UNIX, &c.), or different types of socket addresses (sockaddr_in, sockaddr_in6, sockaddr_un, &c.). Few Scheme systems expose these constants and socket address structures to Scheme; most just hide the `messy details' in C. Actually, `the same returns' is not quite true anyway. The wrappers are generally very fixed in form, and inextensible. Most don't even support any socket options, or have specific procedures to deal with specific socket options. Few have any support for multicast IP. I have never seen support for out-of-band data transmission in any specialized socket wrapper for Scheme. The distinction between shutdown and close is usually either absent or swept under a rug. A general BSD sockets wrapper gets all of these for free, on *any* socket type or domain (except for multicast IP, of course), and can be made to be easily amenable to extension for anything not already wrapped, as long as sockets' descriptors can be exposed straightforwardly in an FFI. Specialized wrappers have to duplicate all this functionality for every type and domain of socket they distinguish between, and they rarely expose any internal interface for FFIs -- that, or they simply don't support any of it. Protocol independence in programs is impossible with specialized OPEN-TCP-CONNECTION procedures: IPv6 has been around for over ten years, and many standard libraries support it trivially through abstractions like BSD sockets in conjunction with getaddrinfo(3), yet programs that support IPv6 well are few and far between because programmers specialize the sockets with the AF_INET family -- and Scheme systems that specialize the BSD sockets to this family specifically *discourage* protocol independence. Unfortunately, programmer laziness in conjunction with a misguided desire for `nicer' interfaces tends toward not supporting any more esoteric capabilities of BSD sockets or flexibility for later extension or protocol independence. Of the Schemes I can think of off the top of my head, only scsh and Gauche have BSD sockets interfaces. These all have specialized wrappers, usually just for TCP, and some UDP, though I think none support any local Unix sockets: Scheme48, Chicken, PLT, Gambit, RScheme, MIT Scheme, SISC... 2006-09-14 R6RS The first public draft of the Revised^6 Report on the Algorithmic Language Scheme has been released, at . The module system, the string API, and the arithmetic specification are all still quite broken. A number of other smaller problems pervade the report. They claim that an operational semantics has been written, but it's not included, and the denotational semantics has been taken out. Why, I do not know. I'm not sure whether I have enough interest left in it to rant much more about it. 2006-09-06 Zob veeblefitzer Foo bar baz quux zot. Frob grovel full lexical... Mumble; chumble spuzz. 2006-07-30 Update: Guillaume Germain found in quantum superposition between Washington and Minnesota Title says it all, folks. 2006-05-29 Missing: Guillaume Germain, author of Termite Monsieur Germain was last seen a little over a week ago in the #scheme IRC channel on Freenode, declaring that he was preparing to undertake the dark journey into the Lesser Evil Empire of Amazon. We can be thankful that he turned down Temptation when she came to his door with an interview at the Greater Evil Empire, but now, even at the Lesser one, he has disappeared and not been heard from in over a week. The Scheme Underground has not published a press release stating that he has possibly been detected in a dark, dank, and dingy dungeon in a basement somewhere in Washington, but the details are sketchy. Preparations are not being made for a raid in the area. [Update: M. Germain appeared spontaneously to inform me that his surname is spelt without a trailing `e', and subsequently vanished just as spontaneously. Whereabouts are still unknown.] 2006-05-29 Pink chickens Chickens, under their feathers, are pink. I don't know whether any of you have ever plucked and feathered a chicken, but it turns out pink underneath. At least, this is what I presume, because I have not myself ever feathered a chicken. I have seen only featherful chickens in farms and chickens in supermarkets which other persons have already had the kindness to feather for me. Still, I can assure you that chickens are pink under the feathers. Now, what are really peculiar are chickens whose *feathers* are pink, and I mean not just slightly pinkish but really bright, hot pink. These things are really bizarre, and I have never seen one in my life; nor have I ever heard of them. But can you imagine seeing a chicken there, yet with blazing pink photons streaming at your retinas, instead of their ordinary brownish reddish hue? [Update: I am informed, among the voluminous profusion of commentary from readers, that an ideal such chicken would be five feet in height and capable of wearing a saddle.] 2006-05-01 On control brackets and resource release [Note, 2010-01-01: There are some errors in this entry. See the entries 2009-02-04 Correction to control brackets and resource release 2009-03-21 Correction to the optimization of resource release in control brackets. For a conciser summary, see the entry 2009-03-28 More simply addressing the confusion about UNWIND-PROTECT in Scheme.] DYNAMIC-WIND has long been a source of contention in the Lisp world. It serves a very particular purpose, but this purpose is not very well understood, and many think that this purpose is the same as the purpose of other Lisps' UNWIND-PROTECT. While the general idea of UNWIND-PROTECT is useful, it is not directly applicable in Scheme, because Scheme has more powerful control abstractions than other Lisps', and it therefore requires more powerful means of controlling them. (UNWIND-PROTECT form protection-form ...), a special form, evaluates the first form and returns its value, but before returning the value also executes the protection forms. Furthermore, if a throw in the primary form transfers control to outside the whole UNWIND-PROTECT form, this, too, will execute the protection forms. This allows programmers to *reliably* clean things up at certain points in the program -- for instance, to close open files, to shut down sockets, to release database handles, &c.; here is hypothetical example of its use: (define (call-with-input-file pathname procedure) (let ((input-port (open-input-file pathname))) (unwind-protect (procedure input-port) (close-input-port input-port)))) (DYNAMIC-WIND before during after), a procedure of three procedural parameters, calls the during thunk, ensuring that any transfer of control into it calls the before thunk, and that any transfer of control out of it calls the after thunk. Normal control transfers, i.e. the initial call to the during thunk and its final return, qualify as control transfers too. This means that, any time control is inside the dynamic extent of the during thunk, any state established by the before thunk will be in effect, but any time control is outside its dynamic extent, that state will be torn down by the after thunk. Here is an actual example of DYNAMIC-WIND's use, from the Edwin text editor: (define (with-current-local-bindings! thunk) (dynamic-wind (lambda () (install-buffer-local-bindings! (current-buffer))) thunk (lambda () (uninstall-buffer-local-bindings! (current-buffer))))) This ensures that, within THUNK, all Edwin variables have any values specified by the buffer's local bindings. This is the kind of use that DYNAMIC-WIND is appropriate for: a sort of control bracket that ensures the presence of certain dynamic state or context within a certain dynamic extent. Now, unfortunately, DYNAMIC-WIND can be seen as somewhat similar to UNWIND-PROTECT, because for basic cases (involving no sophisticated exploitation of Scheme's continuation reification) DYNAMIC-WIND with a null entrance thunk seems to be equivalent to UNWIND-PROTECT. This is misleading, however. Either control may return to the dynamic extent of the during thunk in the DYNAMIC-WIND, in which case the after thunk will be called twice -- which is unacceptably wrong --, or the before thunk really signals an error if it is called twice. This, though, violates the philosophy described in the first sentence of the R5RS: Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary. Signalling an error in a DYNAMIC-WIND before thunk is usually a limitation and a weakness in a program. We have an *extremely* powerful means of abstraction of control -- CWCC[1] -- and we ought not to prevent it from working merely because we want to prevent a certain fragment of code from running more than once. This should, however, spur a deeper look at why the code is being run more than once in the first place. DYNAMIC-WIND is meant to establish local dynamic state for specific dynamic extents within programs, by setting it up in before thunks and tearing it down in after thunks. These operations, to set up and to tear down dynamic state, are meant to be reversible, local changes.[2] That is, permanent actions, such as closing a port, that one would put in a protection form of an UNWIND-PROTECT, are entirely inappropriate in a DYNAMIC-WIND after thunk. Yet the basic idea of UNWIND-PROTECT is still useful. What, exactly, though, is useful from UNWIND-PROTECT, and what is incompatible with Scheme? The concept of permanent finalization is useful, and the identification of a good point to perform it is useful. *Requiring* that it be performed at this point, though, is incompatible with Scheme, because this point may be passed several times, yet we must perform the finalization once and only once. Many languages, and many Scheme systems, provide ways to ensure that certain code be run when an object is no longer reachable. We can apply this same idea, in fact, to continuations! The protection thunk in an UNWIND-PROTECT of a lesser Lisp is run only once the continuation is about to become unreachable. The only difference between Scheme and CWCCless Lisps is that the point at which continuations become unreachable is unfixed in Scheme. So we must use a general object finalization mechanism -- for the kind of object known as a continuation. This is easy to concoct, though: we need only to make an object, to associate with that object the protection thunk so that it will be executed when the object become unreachable, and to make a link from the continuation to that object. There is an implementation of this in MIT Scheme at the end of this entry. This, though, is non-optimal. The protection thunk may run at any time in the system. We should *like* it to be run as soon as control returns through the continuation in question, but this is valid only if that continuation has not been reified and will not ever be reused. We can say that a continuation is /dirty/ if it has been reified and may be instantiated again, and that a continuation is /clean/ if and only if there will be exactly one instance of it, which will be destroyed as soon as control returns through it. If we store a bit in any continuation involved with UNWIND-PROTECT, i.e. any continuation with an associated protection thunk, then we can initially clear the bit, to indicate that it is clean, and instrument CWCC to mark all such continuations as dirty by setting their bits. When control returns through a protected continuation, it checks the bit. If the bit is clear, then the continuation is clean, and it runs the protection thunk, because control will never again return to this point; it also deregisters any finalization procedure. If the bit is set, then the continuation is dirty, and general finalization system for objects of unlimited extent will handle the protection thunk. (We need to be careful in the case of a reified continuation instantiated on the stack whose heap storage has been reclaimed. The continuation will be marked dirty, but before control returns through it for the last time its protection thunk might be run, because the GC could finalize its heap storage.) Both DYNAMIC-WIND and UNWIND-PROTECT are useful for different applications in a language with powerful control abstractions such as Scheme. DYNAMIC-WIND, though, does not subsume the operation of UNWIND-PROTECT, and UNWIND-PROTECT cannot be taken literally from a language with weaker control abstractions. By examining the core of the intent of UNWIND-PROTECT, though, we can separate its actual purpose from an optimization. We can implement the actual purpose, and then analyze cases where the optimization can still be useful, in order to provide UNWIND-PROTECT efficiently in Scheme. The result is that we can retain the powerful control abstraction of CWCC and in the same language integrate the same idioms of resource release as other languages provide. Of course, there are still some limited cases where restriction is really necessary. There are a number of procedures in MIT Scheme that work with sensitive strings, and into which reentry is not an option, in order to enforce the proper destruction of sensitive information. In this case, DYNAMIC-WIND is appropriate to reject reentry and to nullify sensitive strings on exit. [1] Some would argue that composable continuations ought to be enough for everyone, but the issues here are exactly the same whether we use CWCC or composable continuations. [2] This is not *exactly* true, but symmetry is a very desirable property of DYNAMIC-WIND before & after thunks, and reversible set-up & tear-down operations are usual. Appendix: Variation on a Theme of UNWIND-PROTECT (define continuation-protectors (make-gc-finalizer (lambda (protector) (protector)) cell? cell-contents set-cell-contents!)) (define (make-protector-cell protector) (let ((protector-cell (make-cell (let ((protector-state (get-dynamic-state))) (lambda () (let ((state (get-dynamic-state))) (set-dynamic-state! protector-state #f) (protector) (set-dynamic-state! state #f))))))) (add-to-gc-finalizer! continuation-protectors protector-cell) protector-cell)) (define get-dynamic-state (access GET-DYNAMIC-STATE (->environment '(RUNTIME STATE-SPACE)))) (define set-dynamic-state! (access SET-DYNAMIC-STATE! (->environment '(RUNTIME STATE-SPACE)))) (define-syntax unwind-protect (syntax-rules () ((UNWIND-PROTECT form protection0 protection1 ...) (*UNWIND-PROTECT (LAMBDA () form) (LAMBDA () protection0 protection1 ...))))) (define (*unwind-protect thunk protector) (let* ((protector-cell (make-protector-cell protector)) (result (thunk))) ;; This ensures that the continuation will hold a reference to ;; the cell, so that it will not be garbage-collected until ;; after either control returns to this continuation or a ;; throw causes this continuation to be discarded and made ;; unreachable. (identity-procedure protector-cell) result)) 2006-04-17 Microsoft vermin I am shocked -- *shocked!* -- to hear that the developer of Termite had an interview for a potential job at Microsoft. (Termite, for you ignorant wretches, is Erlang-in-Scheme for Gambit. Find it on your own.) 2006-04-07 XEmacs S-expression motion Turns out paredit.el 20 beta doesn't support XEmacs very well after all. The problem is that XEmacs's S-expression motion commands don't signal SCAN-ERROR conditions like GNU Emacs does: they just signal general ERROR conditions. Programs *cannot* usefully discriminate S-expression motion scan errors from other kinds of errors, and, on software engineering principles, I refuse to modify the code to catch and ignore all errors. Grrrmph. (If any XEmacs developers read this (as if): Could you please either explain why the S-expression motion code signals ERROR conditions rather than SCAN-ERROR conditions in specific, or, if there is no good reason, fix XEmacs?) 2006-03-30 Immobilizing blobs There is a furry, orange blob on my lap preventing me from moving. I believe the scientific classification for this kind of blob is `cat'. Luckily, the state of immobility is only temporary, apparently regulated by some sort of nutritive urges in the blob, and also occasional boredom and needs for entertainment. 2006-03-23 I got better... For those of you who actually *read* this thing and were a bit worried after my 2006-03-07 post, I've recovered mostly, although I'm maintaining a good cough. But apparently you remain ill, since you haven't overcome the affliction of blag addiction, as you're still reading this now. 2006-03-10 paredit.el for XEmacs; deletion keys Recently I have been fiddling with making paredit.el work for XEmacs. Not many changes seem to be necessary -- although I haven't yet tested it exhaustively under XEmacs, as I'm basing my changes on Martin Gasbichler's port to XEmacs --, since paredit.el consists mainly of pretty basic elisp, but one of the most annoying problems to deal with was getting the deletion keys (forward-delete and backward-delete, or Delete & Backspace) right. [I forgot to finish this post, so it was sitting in an autosave file for two weeks, and, though I had a big rant all planned out two weeks ago about getting deletion keys right, I've forgotten what I had to say. Sorry. Also, paredit.el for XEmacs isn't finished yet because I keep forgetting to get around to it.] [Update, 2009-12-31: Thanks to gjs, I now have an arbitrarily large collection of round tuits. Fortunately, I think I made paredit run on XEmacs a couple of years ago.] 2006-03-07 More flu! Not only was the flu great fun, but, after only a few days of having seemed to be mostly rid of it, it has decided to come back to bring *more* joy to my life! Mmmmm... 2006-02-24 Flu! The flu is great fun. I recommend that any of you who have not had the pleasure of experiencing it, find a contagious surface *right*now* and infect yourselves with the glee of knowing what fun and games await you! 2006-02-23 Paredit.el versions 18 & 19 This happened a while ago, but paredit.el 18 is now the released version, and paredit.el 19 is the new beta. See ; find the relevant files yourself. Version 19 supports insertion of balanced delimiters other than parentheses, and a fancy S-expression transportation device suggested by Guillaume Germain, but I haven't yet finished the latter and put it up on the web yet. 2006-02-06 On the dangers of blaggery Until several acquaintences coerced me into recording a `blog' (a term which I abhor almost as much as the concept it denotes), I tended to avoid the the things in general, excepting semi-frequent browsing of Planet Lisp . There are a number of reasons for my eschewal of the reading or writing of blags. The first reason is the content, or absence thereof. There is a tendency among blag writers to write when it amuses them to write, not when they have anything to say. I need not elaborate further on this point; I am sure that the reader is quite familiar with the saturating profusion of content-free blags throughout the web. The second reason is that blags are astonishingly efficient sinks of time. Right now I ought to be doing something else. But I am not doing something else. This is because there is some aspect of the nature of blags that compels their authors to write entries, and, even in the rare case that the author has anything worthwhile to say, the time composing the entry would be better wasted on a different amusement device, such as a cat. (More accurately put, the author ought to spend more time amusing the cat than dribbling drivel onto the web.) This urge to write blag entries instead of doing useful things such as acquiescing to feline demands takes its toll also in other ways. For example, programmers will often briefly emit mutterings on their blags about their latest objects of fascination, without releasing their code. If the existence of that code is an equivocal one, this sputtering of substanceless spoutage is known in some circles as `wanking', a technical term. Wanking is not, and never has been, desirable behaviour to be associated with. Actual code can speak for itself, but speaking for that which has no existence, or at best a vaporous existence, is about as useful to others as a fish is to a bicycle. Or something like that. Yet these programmers persist in writing blag entries instead of substantial code. Finally, the blag is usually directed at an unclear audience. Since there is no particular audience that the blag is intended to cater to, and so for any particular audience much information is discussed in much greater detail than is necessary, and much other necessary information is omitted. Arguments are ill-defended because the blagger has no connection to his audience when writing the blag entry; he must build all the argument on a nebulous idea of what and whom he's trying to argue against. The result is poorly laid-out, ill-conceived, and unpersuasive arguments or other types of entries. Thank you for slogging through the awful post above. I hope that its mere form and existence have convinced you of what it failed to argue for itself. At the very least, perhaps it will dissuade you from assigning further merit to this blag! 2006-02-04 Disturbing blaggery I have noticed a disturbing urge to write more material in this joke of a text file in the past few days. I will have to quell this strange compulsion. I feared it would happen from the start. 2006-02-03 Option types, optional parameters Languages like ML and Haskell that are heavily based on pattern matching have a very simple but elegant mechanism for representing value and absence of value without employing a special value that is still a value like any other: option types. The basic idea is that there is a type which I shall denote with (OPTION ), where is the type of the value that may be present. Two kinds of terms have this type: (SUPPLIED ) and (ABSENT), where has the type . For example, a TABLE-ENTRY procedure might return (SUPPLIED ) or (ABSENT) to denote that there is no value in the table associated with the given key. The idiom in Lisps tends to be to use the false constant (#F or NIL) for representing the absence of value, but this inhibits #F or NIL from being itself a present and useful value. (The conflation of the false constant and the empty list in old, crufty Lisps exacerbates this.) This poses a problem, for example, if you want a general table mapping some key to any object -- including #F -- but you also want to be able to determine when an association is absent in the table. I call this conflation /domain contagion/, because NIL or #F or the symbol NOT-HERE are all still in the domain of expressible values, when they should be kept apart, because their whole purpose is to express the *absence* of an expressible value. There are actually two different problems here. The first I shall leave for another day, although it is more relevant to the TABLE-ENTRY example. It is that procedures like TABLE-ENTRY must convey information about control flow to their callers via data flow. This indirection ought not to be necessary. Olin Shivers wrote a paper called `Multi-return function calls' (although I think that `multi-continuation procedure calls' is a better term) about this fairly recently. The basic idea is that, rather than giving a procedure a single continuation, which itself will discriminate the data that the procedure returns to it, the caller supplies a *choice* of several continuations, and the callee picks one to return to, discarding the rest. The problem of denoting absence of value in procedures like TABLE-ENTRY can be solved by passing two continuations to TABLE-ENTRY: one for the case where there is a value (which continuation accepts one argument), and one for the case where there is no value (which continuation is nullary). I recommend the paper; I may write something on the subject later, too. The other problem is not in eliding the indirection of data flow, but in storing data. Here is a simple example: optional parameters. There is no control flow involved here. If a procedure accepts optional parameters, it simply has several variables which may or may not have been supplied as arguments. Now, while there are some cases where it would suffice to specify in the lambda default values for the case of absence of the arguments, there are other cases where it is useful either to pass on the optional parameter's value to another procedure (and retain the supplied/absent status) or to conditionalize on whether the parameter was supplied in order to do something fancier altogether. Here are several strategies currently employed by existing Scheme systems: 1. Give no provision for specifying default values in the lambda syntax itself, and instead a. use a special value, such as #F (DSSSL) or some `default object' token (current MIT Scheme). -- This is subject to domain contagion as discussed above. b. make references to the variables corresponding with optional parameters trap, and provide a special form DEFAULT-OBJECT? which tests whether references to those variables would trap in order to conditionalize on their presence as arguments (older MIT Scheme). -- This requires some pretty kludgey changes to the semantics of variables in the language. Furthermore, it inhibits the passage of unsupplied optional parameters on to other procedures: you can't simply pass on an unsupplied parameter to another procedure by passing the value of the corresponding variable, because references to that variable trap. (MIT Scheme no longer uses this semantics for optional arguments.) 2. Require that the default value be supplied. (PLT's OPT-LAMBDA) -- This is inconvenient in some cases, and, while there are ways around domain contagion with this strategy, they are pretty inconvenient and they defeat simple static type analysis (which is still useful for dynamically typed languages like Scheme): (LET ((NOT-HERE (CONS 'NOT-HERE '()))) (LAMBDA (... #!OPTIONAL (X NOT-HERE) ...) ... (IF (EQ? X NOT-HERE) ...) ...))) 3. Bind another variable to a boolean flag indicating whether an optional parameter was supplied. (Common Lisp) -- This fixes most of the disadvantages of the above, but it still does not allow the convenient passage of unsupplied optional parameters on to other procedures. It can also get to be syntactically pretty heavy-weight in usage. All of the problems of domain contagion, type analysis defeat, passage of unsupplied optional parameters onto other procedures, and complexity of usage, would be alleviated by the introduction of an option type as in Haskell or ML (or even dynamically typed languages heavily focussed on pattern matching like Erlang). There would be two remaining issues: undesirable boxing, and the method by which to conditionalize on option types. When it comes to data already stored in the heap, the boxing of an option type is not much of an issue; it is merely one more layer of indirection. In fact, on, say, a 64-bit system, where three tag bits is ideal, one possibility would be to use one of those tag bits for flagging option types. (This may not be a great idea, but it is an idea nonetheless which might bear worth experimenting.) As for optional parameters, for which run-time costs of heap allocation is not a reasonable option (sorry for the pun), we must first assume that the compiler knows about the option type in the first place; with this assumption in mind, it is not difficult for the compiler to recognize trivially optimizable tests of the option's presence. For example, it could map such tests to comparisons against an NARGS register. The syntax for testing optional data might be, without pattern matching, a bit clumsy. There are a couple ways it might be accomplished: a new special form could be added, (IF-OPTION (