Do Something Weird

"This is the kind of possibility that the pointy-haired boss doesn't even want to think about. And so most of them don't. Because, you know, when it comes down to it, the pointy-haired boss doesn't mind if his company gets their ass kicked, so long as no one can prove it's his fault. The safest plan for him personally is to stick close to the center of the herd.

Within large organizations, the phrase used to describe this approach is "industry best practice." Its purpose is to shield the pointy-haired boss from responsibility: if he chooses something that is "industry best practice," and the company loses, he can't be blamed. He didn't choose, the industry did.

I believe this term was originally used to describe accounting methods and so on. What it means, roughly, is don't do anything weird. And in accounting that's probably a good idea. The terms "cutting-edge" and "accounting" do not sound good together. But when you import this criterion into decisions about technology, you start to get the wrong answers.

Technology often should be cutting-edge. In programming languages, as Erann Gat has pointed out, what "industry best practice" actually gets you is not the best, but merely the average. When a decision causes you to develop software at a fraction of the rate of more aggressive competitors, "best practice" is a misnomer."
—Paul Graham, Revenge of the Nerds

Advent of Code 2017: Week 1

After 7 days, let's see how my Advent of Code 2017 is going.

  • Web framework: I made a standard web console, which I then copy forward to the next day. I could just as easily have put all the buttons on a single page, but it'd get too long by day 31.

  • Unit testing: As seen in stdlib.js, my test framework is very simple: On page load, setup, run some asserts, finish to get stats and redbar/greenbar the test console. This has been a great win, even though several of the days had only one or two examples.

  • 01:

    "The captcha requires you to review a sequence of digits (your puzzle input) and find the sum of all digits that match the next digit in the list. The list is circular, so the digit after the last digit is the first digit in the list."

    • So this is nice and easy, linear problem for JS string processing. On the second variant I turned the inner function to pick the next char into a function, which I pass in. No external state.
  • 02:

    "The spreadsheet consists of rows of apparently-random numbers. To make sure the recovery process is on the right track, they need you to calculate the spreadsheet's checksum. For each row, determine the difference between the largest value and the smallest value; the checksum is the sum of all of these differences."

    • Needed to write more utils, this time to process strings into a table of numbers. My first solution kept it as a table of strings, and then I had problems with JS type coercion. Same strategy of finding a pure functional inner loop and extracting that as a function. Actually, subtotalFunc1 mutates cols, but you'll never use it again so it doesn't matter.
  • 03:

    "Each square on the grid is allocated in a spiral pattern starting at a location marked 1 and then counting up while spiraling outward. For example, the first few squares are allocated like this:

    17 16 15 14 13
    18 5 4 3 12
    19 6 1 2 11
    20 7 8 9 10
    21 22 23---> ...

    While this is very space-efficient (no squares are skipped), requested data must be carried back to square 1 (the location of the only access port for this memory system) by programs that can only move up, down, left, or right. They always take the shortest path: the Manhattan Distance between the location of the data and square 1."

    • This was a little insane. There's apparently a pure math solution, but I am not a mathematician, I am a turtle; also song; also Lewis Carroll; also Gödel Escher Bach. So I solved it by moving a turtle around the spiral, turning left whenever there's an open space, and then I could just examine the spiral for values.
    • Yes, I store points as a vector-2 of numbers, rather than making a "Point" class with x,y. In C using a struct makes more sense (since it only takes 8 bytes intead of the 40+ in any object system!), but in anything else, the vector requires the least memory, least work, and is easiest to serialize, use as a hashtable key, and so on.
    • I just copy-pasted and modified for the second task, instead of making a nice inner function to pass in. It's not incredibly hard to parameterize, but I would need to break out of that inner function to return early, so probably using an exception for flow control?
    • Javascript objects are super-useful, like Python dicts but even easier to use. Stuff a bunch of points for the grid, and properties for the minPt, maxPt (bounds) and lastPt, one structure gets to hold everything about the spiral. Writing this in a type-safe or pure functional language would be annoying.
  • 04:

    "A passphrase consists of a series of words (lowercase letters) separated by spaces.
    To ensure security, a valid passphrase must contain no duplicate words."

    • Super simple, a histogram that only counts to 1. The second part only requires sorting the chars in each word, so you can check they aren't anagrams. I was worried that day 3 was the start of an exponential curve in difficulty, so by the end it'd take to the end of the Earth to finish.
  • 05:

    "The message includes a list of the offsets for each jump. Jumps are relative: -1 moves to the previous instruction, and 2 skips the next one. Start at the first instruction in the list. The goal is to follow the jumps until one leads outside the list.
    In addition, these instructions are a little strange; after each jump, the offset of that instruction increases by 1. So, if you come across an offset of 3, you would move three instructions forward, but change it to a 4 for the next time it is encountered.
    How many steps does it take to reach the exit?"

    • I returned the program counter at first, and that passes the sample data test, so I "guessed" wrong the first time, but then reread and found my bug. Having only one example is a problem. Both parts are as usual solved by passing in a function to determine the next state of the instruction. Trivial one aside from my stupid bug.
    • I'm building up a good library of tools by now.
  • 06:

    "In each cycle, it finds the memory bank with the most blocks (ties won by the lowest-numbered memory bank) and redistributes those blocks among the banks. To do this, it removes all of the blocks from the selected bank, then moves to the next (by index) memory bank and inserts one of the blocks. It continues doing this until it runs out of blocks; if it reaches the last memory bank, it wraps around to the first one.
    The debugger would like to know how many redistributions can be done before a blocks-in-banks configuration is produced that has been seen before."

    • Making the balancer was straightforward, I stored each state as a string in a history array, and a parallel histogram to catch the duplicate; I could get rid of hist but it turned out to make the second task easier.
  • 07: (Don't read this or the code until tomorrow if you don't want a spoiler)

    "You offer to help, but first you need to understand the structure of these towers. You ask each program to yell out their name, their weight, and (if they're holding a disc) the names of the programs immediately above them balancing on that disc. You write this information down (your puzzle input). Unfortunately, in their panic, they don't do this in an orderly fashion; by the time you're done, you're not sure which program gave which information.
    Before you're ready to help them, you need to make sure your information is correct. What is the name of the bottom program?"

    • Building the tree was a good puzzle, but not hard: Parse the text into nodes, keep them in a dictionary, then build the structure, and return the only parentless node (the root). Making a recursive toString so I could debug it was important…
    • Second task looked to be tedious (depth-first search and pass the result all the way up), but then I just looked at my output and saw the unbalanced numbers, so entered it by hand. I am a computer, too.

I've got both gold stars each day. I'm completely incapable of reliably checking in at exactly 21:00 PST, and I'm too fussy about my code to ever be "the fastest", so my rankings are awful; maybe it ought to count from when you read the problem set, but then everyone would cheat on at least the first task.

Advent of Code 2017

I've joined the Advent of Code, and I'll be doing it in JS. Got my 2 stars for the day.

For good discipline (or as a handicap), I'm building a halfway-decent set of pages, unit testing framework, and sort of doing things right (good ES6 practices) instead of easy (hack some inline JS in compatibility mode). I'll link it in the sidebar tomorrow, when challenge 1 expires: My Advent of Code

Go on and do it yourself! I say "easy mode is for babies" and make it hard on myself, but really you can do this in anything. There's a perfectly nice Chipmunk BASIC or FreePascal if you're old-school.

Note: The leaderboard reset time is ridiculous, and I don't care about speed-coding or leaderboards. Don't stress about competing for first 100 completions, just do the thing.

Swiftian Satire, or Tragedy?

I honestly cannot tell if Swift developers are seriously eating Irish babies, or taking the Mickey.

From bad implementations of Equatable and Hashable, the wag leaps to:

extension GridPoint : HashVisitable {
    func hash<H: Hasher>(_ hasher: inout H) {
        self.x.hash(&hasher)
        self.y.hash(&hasher)
    }
}

Bravo! That's easily the funniest punchline to a programming joke since "where do you think the chaos came from?"

Starts with the most bizarre strawman Objective-C Sieve of Erathosthenes I've ever seen. A real implementation would be in C, because Obj-C is C with objects, and it'd be massively faster:

#include <stdlib.h>
#include <stdio.h>

typedef char BOOL; // or link in Foundation
#define YES 1
#define NO 0

int main(int argc, char **argv) {
    if (argc != 2) {
        printf("Usage: primes COUNT\n");
        exit(1);
    }
    long n = atoi(argv[1]);
    BOOL p[n];
    p[0] = NO;
    p[1] = NO;
    for (long i = 2; i < n; ++i) {
        p[i] = YES;
    }
    for (long i = 2; i < n; ++i) {
        for (long j = i*i; j < n; j += i) {
            p[j] = NO;
        }
    }
    for (long i = 1; i < n; ++i) {
        if (p[i]) {
            printf("%ld ", i);
        }
    }
    puts("");
    return 0;
}

This does use more lines of code, but they're short, low-density, and it's instantly obvious what it's doing (my predilection for 1-char var names aside). Can you actually decode his filter-based version?

func sieve(_ sorted: [Int]) -> [Int] {
    guard !sorted.isEmpty else { return [] }
    let (head, tail) = (sorted[0], sorted[1..<sorted.count])
    return [head] + sieve(tail.filter { $0 % head > 0 })
}

let numbers = Array(2...1000000)
let primes = sieve(numbers)
print(primes)

And the runtime experiment:

mdh@Aegura:~/Code/CodeC% time clang -O3 -o primes primes.c                
clang -O3 -o primes primes.c  0.04s user 0.41s system 83% cpu 0.534 total
mdh@Aegura:~/Code/CodeC% time ./primes 1000000 >~/Desktop/primes.txt      
./primes 1000000 > ~/Desktop/primes.txt  0.02s user 0.00s system 89% cpu 0.025 total

mdh@Aegura:~/Code/CodeSwift% time swiftc -O -o swiftPrimes swiftPrimes.swift
swiftc -O -o swiftPrimes swiftPrimes.swift  0.57s user 0.64s system 69% cpu 1.754 total
mdh@Aegura:~/Code/CodeSwift% time ./swiftPrimes >~/Desktop/swiftPrimes.txt  
./swiftPrimes > ~/Desktop/swiftPrimes.txt  51.79s user 26.49s system 99% cpu 1:18.78 total

So the naïve C implementation is about 3,151x faster. I can't measure it precisely because a limit measurable in C, would take Swift until the heat death of the Universe.

So here's my question: Is Vincent aware of this, and his theme of "diabetes", "sugar", "saccharine", etc. pointing at how fat, bloated, slow, and deadly Swift is? He never lets on if this is a joke, he keeps tossing more syntax layers on top of Swift.

Not So Easy to Get a Program Right

"By June 1949 people had begun to realize that it was not so easy to get a program right as had at one time appeared. I well remember when this realization first came on me with full force. The EDSAC was on the top floor of the building and the tape-punching and editing equipment one floor below on a gallery that ran around the room in which the differential analyser was installed. I was trying to get working my first non-trivial program, which was one for the numerical integration of Airy's differential equation. It was on one of my journeys between the EDSAC room and the punching equipment that hesitating at the angles of stairs the realization came over me with full force that a good part of the remainder of my life was going to be spent in finding errors in my own programs."

-Maurice Wilkes, Memoirs

TPortableNetworkGraphic

Trying to load a PNG image and display it on the canvas has been… not fun. Delphi only knew about BMP stored in some horrible Microsoft resource format, and so everything FreePascal adds on top is a pile of hacks or undocumented features. The WWW was not especially helpful.

Mostly I'm posting this so someone else might find it in a search.

uses Classes, SysUtils, Controls, Graphics, LCLType, types, contnrs;

{
type
    TSprite = class
    public
        color: integer;
        filename: array of utf8string;
        subrect: array of TRect;
        constructor Create(c: integer; f: utf8string; r: TRect);
    end;
}

var
    imageCache: TFPObjectHashTable;

function getImage(filename: utf8string): TPortableNetworkGraphic;
var
    img: TPortableNetworkGraphic;
begin
    img := imageCache.Items[filename] as TPortableNetworkGraphic;
    if img = nil then begin
        try
            img := TPortableNetworkGraphic.Create();
            img.loadFromFile(filename);
            imageCache.Items[filename] := img;
        except on e: Exception do begin
            LogError(Format('Image %s: %s', [filename, e.message]));
            raise;
        end;
        end; // try
    end;
    Result := img;
end;

procedure drawSprite(canvas: TCanvas; spr: TSprite; srect: TRect; tick: integer);
var
    img: TPortableNetworkGraphic;
    frame, nframes: integer;
begin
    if spr.color <> dawnUndefined then begin
        canvas.Brush.color := spr.color;
        canvas.FillRect(srect);
    end;

    nframes := length(spr.filename);
    if nframes > 0 then begin
        frame := tick mod nframes;
        img := getImage(spr.filename[frame]);
        canvas.CopyRect(srect, img.canvas, spr.subrect[frame]);
    end;
end;

Pascal Learning Curve

What I've learned so far:

  • I spent a while trying graphics libraries (or failing to even compile them) before deciding I don't understand the UI model enough yet, so I'll prototype with some high-level drawing and circle back around to OpenGL or SDL.
  • Build a do-nothing app in Lazarus, make a single form with a default FormCreate method, then quit out and write code starting from there in BBEdit.
    • Part of that is that I'm not going to use a ton of GUI components, and code building is the evil opposite of Interface Builder. In IB, you edit UI and connect it to method names scanned out of the source code, it doesn't touch your code.
    • The Lazarus editor is nightmarishly wrong and keeps inserting stuff in my code which makes me crazy. Maybe there's non-crazy-making settings, and probably it seems fine to masochistic Windows and Linux users, but I make enough problems in my life.
  • Naming conflicts are a giant problem, so my current naming scheme is: For class "foo", it goes in file FooUnit.pas, containing unit FooUnit and type TFoo = class…. I'm naming instance fields _bar and accessors bar() and setBar() as I do in most languages. I've mostly got the compiler to stop screaming at me every build. Not letting you name a unit, class, and field the same thing is infuriating.
  • Build with lazbuild -B --bm=Release whatever.lpi; I wrote a script to choose Debug or Release builds and launch the app if nothing went wrong, which is close enough to hitting Cmd-R.
  • Bookmark the docs for:
    • RunTime Library
    • Free Component Library - in particular unit 'contnrs' wants to buy some vowels but has dictionaries, lists, etc.
    • Lazarus Class Library
    • There's very little explanation, so often I have to go digging in source like /Developer/lazarus/lcl. I lost about 30 minutes today because they didn't document that TCanvas.FillRect uses Brush settings, TCanvas.Rectangle uses Pen settings, and I figured it out by reading the Carbon implementation. ? ☕️ ?
  • Almost always if there's a non-domain-specific type I need it already exists. Batteries are included but mostly they're named badly, or upside down, or hidden in sofa cushions, or the dog buried them and I need a metal detector, or my psycho ex stole them and is holding them hostage for a pity fuck.
  • The actual implementation code isn't much different from any other procedural language. For a guy who codes in Pascal for one year every 10 years, it's rolling along pretty fast. The near-equivalency of records and classes, and of functions, properties, and methods is convenient. Defining vars before using them, like in old-timey K&R C, is not convenient. Inline variable definition would be a gigantic quality of life improvement, which I doubt they'll do.

Pascal

Trying out alternative languages to work around my performance and native binary problems, I've circled back around to the '70s and '80s: Pascal. I used classic Pascal on Atari 800 and TRS-80 back in the day, and did quite a bit with Kylix (Linux version of Delphi) before Borland killed that.

Pro

  • Fast Compiles. FreePascal compiles faster than anything I've used in ages. That was always a major Pascal selling point, and it still is. Optimized for programmer time.
  • Fast Runtime. As close to perfectly optimized machine code as you're going to get. Computer Language Shootout has competitive times with C++ for most benchmarks, and I think the worst-cases are variations in style.
  • Object-Oriented. FreePascal reimplements Delphi-style objects, which are pretty standard Simula-type OOP. I dislike having to tag methods as virtual, like some C++ or Swift peon, but it has everything I'd expect in a modern OOP system.
  • Reference Counting. No GC pausing, no manual memory management. Like Objective-C 1.0, you have to nil-out field references in your destructor, but otherwise you never need to worry about it.
  • Exceptions. Unlike Objective-C and Swift (which relies on Obj-C frameworks), you can throw exceptions and catch them and the program keeps working. Hooray! Flow control that isn't insane! There's no checked exceptions, which is sad, but it works.
  • Cross-Platform. Mac, Linux, Windows, Android, and iOS. Has SDL and OpenGL bindings, and some other options. I'll see how building out UI for each of those works, but it's not trapped on Mac like most other choices.
  • Native Binary. No source code included in the downloaded app. Dynamic language obfuscators are a minor obstacle at best, while machine language is hard enough to decompile. Sure, the other option is to put the program online and just have a thin client in the user's hands, but I'm old-fashioned, I believe in networkless programming and not paying Amazon for server time.
  • Easy Native Library Integration. Pretty much seems to be defining functions as external and calling.
  • Case-Insensitive. I'm usually neurotic about proper capitalization, but here it's a mercy: The classic Pascals were all-uppercase, Delphi CapitalizedEveryWord, but I prefer lazyCaps. FreePascal doesn't care.
  • Real Programs. There's working programmers using FreePascal to keep their (often very expensive) Delphi software running, and writing new code in it. That makes me confident it's not an unsupported toy, and there's current documentation and help.
  • BBEdit. Object Pascal syntax mode works fine.

Con

  • Bondage & Discipline. Not quite as BDSM as Java, Swift, Haskell, or classic ISO Pascal, which in practice have no safewords. You can use dynamic arrays, Variant and OOP types, and even dangerously cast anything to anything or screw around with pointers, but it's not beautiful anarchy like Python or JavaScript.
  • Pascal Syntax. Verbose begin/end pairs everywhere, long words of function, procedure, and such. Semicolon rules are insane (yes, they're terminators not separators; this is not how we use them in any other language, including English), I've taken to just always using begin/end blocks because I don't trust a misplaced semicolon not to terminate the wrong block.
  • Documentation. FPC's docs assume you already know Delphi. I found some decent docs at Borland's site and old Pascal textbooks, but I dunno how a normal person would learn this. Some of the libraries have moved in 3.0, and you're never going to figure this out unless you like digging thru the guts of a language.
  • Configuration. Put this in fpc.cfg somewhere, and export PPC_CONFIG_PATH to the path containing it:
    #WRITE Compiling with fpc.cfg
    -O3
    -Xs
    -MOBJFPC
    -Sh
    -Fu/usr/local/lib/fpc/$fpcversion/units/$fpctarget/rtl-console
    -Fu/usr/local/lib/fpc/$fpcversion/units/$fpctarget/regexpr
    

    The write is just a sanity check that I have it configured. Instead of the next two lines, I could put -dDEBUG or -dRELEASE on the fpc command-line, but I'm not currently using gdb (unfrozen caveman Mark debugs by writeln), so this is easier. -MOBJFPC forces modern FreePascal mode, not a compatibility mode. -Sh forces a default string type of ansistring instead of shortstring; but to be precise, I always specify utf8string. The -Fu lines add some paths where libraries have been moved.

    I want to have the local directive {$M+} (reflection support) always turned on, but I can't figure out any command-line option to do that.

  • Look Like a Crazy Person. But sometimes the crazy people are right.

Example

To (re)learn the language, I wrote a 4-function RPN calculator for Mac console: RealCalc

Presumably it compiles just fine on Windows or whatever, but you'll have to customize the fpc.cfg file. I'm a ways from dealing with that.

Perl 6

This is kind of fascinating in the train-wreck sense, but even aside from weirder-than-Perl5 syntax, taking a speed penalty like that in modern times is not acceptable; computers are fast enough to solve all problems in a few seconds, but the slower your program is, the more power it consumes, and that costs you real money in a data center.

I don't see any support in Perl6 for a web server, GUI, low-level graphics, sound, or even low-level UNIX libraries, you have to use NativeCall with a lot of wrapper code for every struct. Perl5 made Gtk pretty easy to use, hideous as it may be, and mod_perl made it a powerful choice on Apache servers.

What we expect from a systems programming language has changed since the '70s-'90s when Perl and its immediate ancestors were invented, and even the most rough, low-level programmers don't want to reinvent everything from assembly up anymore.

Python has multiple ways now to get faster, I'm currently trying out Cython and getting faster, compiled Python binaries, which can directly call C code (because by that point it is C code). Python's standard Tkinter GUI is primitive, but SDL works well if you can distribute it. Python's system libraries and web frameworks are top-notch.

JavaScript isn't fast, but modern runtimes are surprisingly good; I'd have to write some tests to see how Node or Electron compares to Perl 6, but I'd bet on the massive VM investments in JS. JS does everything now, it's certainly the most familiar user interface these days.

If you just want a multi-paradigm language for hacking, Scheme and Racket are ideal, supported by tons of papers and books like SICP, and they compile to fast native code.

And then there's the real outliers, like Lazarus, which is a Delphi-like Pascal IDE, or the usual "I want a hobby language" choices of Haskell, OCaml, Clojure, etc.

Even if Perl6 had come out in the first decade of its development, it would've been a little backwards, but compared to modern choices it's archaic.

I’ve avoided the Whatever Star because, in addition to making Perl 6 look like a lineal descendant of brainfuck, it is governed by rules that are too subtle for my understanding.
—Evan Miller