Julia More Packaging & Code

I don't just drink coffee or booze, watch movies and Internet drama, and look cool. I code sometimes, too! Who knew?!

Carrying on with my experiment in Julia, packaging has another step needed to make references. For instance, Ansi uses Geometry, so:

Geometry/Project.toml:


authors = ["Mark Damon Hughes "]
name = "Geometry"
uuid = "e3172796-a620-11e8-2cbf-612649bb77f8"
version = "0.1.0"

[deps]

Ansi/Project.toml:


authors = ["Mark Damon Hughes "]
name = "Ansi"
uuid = "72992c94-a620-11e8-3d05-55611ea0dbd0"
version = "0.1.0"

[deps]

Geometry = "e3172796-a620-11e8-2cbf-612649bb77f8"

Ansi/Manifest.toml:


[[Geometry]]
repo-rev = "master"
repo-url = "/Users/mdh/Code/CodeJulia/Geometry"
uuid = "e3172796-a620-11e8-2cbf-612649bb77f8"
version = "0.1.0"

All the boldface code is what I wrote/copy-pasted, the rest is generated by juliaMakePackage.zsh. I may go ahead and make a tool to link projects, because it's so error-prone. In fact, I cheated, and made a single Manifest.toml which I copy to all projects so far, and can replace whenever something updates.

Anyway, this gets me to a nice state where I can write using Ansi in my project and it'll just find it. IIUC, if I move all the libraries to a public repository, I can just change the repo-url and the packages are downloaded into ~/.julia cache somewhere.

I still haven't followed up on making a binary application; the more I look into that, the jankier it seems, more like something to defer until there's an official solution. Putting a real UI on it is also something to work on, but that's much more doable.

Coding

I've written a lot more code, over 1000 LOC, not just screwing around with packages. Mostly this is enjoyable, it's a nice systems programming language. The ugly parts haven't yet driven me insane, they're just things to work around or ignore. Far less frustrating than almost any other new language; Rusty Nail In Your Head and Go Fuck Yourself Its Google aren't my favorites.

Strong typing really is a pain in the ass. Declare a variable or struct field foo, and it takes anything. Type it with foo::AbstractString, and you soon learn nothing is not a string; foo::Union{AbstractString,Nothing} is necessary to be nullable. Ick.

Enumerations

Enumerated types @enum are disappointing. They're a little smarter than C enums, but not as useful as Java enums. They just represent a value; but you have to cast them to Int every time you use them for their value, so too painful to use them as array indices. Or as characters, a thing I like a lot for debugging. And they're not easy to reflect on:

julia> @enum Terrains begin
               Ter_Floor = Int('.')
               Ter_Wall = Int('#')
       end
julia> Ter_Wall
Ter_Wall::Terrains = 35
julia> Int(Ter_Wall)
35
julia> Char(Int(Ter_Wall))
'#': ASCII/Unicode U+0023 (category Po: Punctuation, other)
julia> String(Char(Int(Ter_Wall)))
ERROR: MethodError: no method matching String(::Char)
julia> string(Char(Int(Ter_Wall)))
"#"
julia> # FFS
julia> string(Ter_Floor)
"Ter_Floor"
julia> # Surprisingly easy!
julia> instances(Terrains)
(Ter_Floor::Terrains = 46, Ter_Wall::Terrains = 35)
julia> # Shit, this is a named tuple, not a dictionary!
julia> useful_instances = Dict()
Dict{Any,Any} with 0 entries
julia> for v in values(instances(Terrains))
           useful_instances[ string(v) ] = v
           useful_instances[ string(Char(Int(v))) ] = v
       end
julia> useful_instances
Dict{Any,Any} with 4 entries:
  "Ter_Floor" => Ter_Floor
  "Ter_Wall"  => Ter_Wall
  "#"         => Ter_Wall
  "."         => Ter_Floor
julia> # JFHC

That was an annoying adventure to get a simple reverse lookup.

Julia Local Packaging

So I wanted to move all my common Julia code to a support dir. My filesystem has for 30+ years contained:

$HOME/
    Code/
        CodeC/
            foo/
                src/
                    bar.c
        CodeJava/
            foo/
                src/
                    com/
                        mdh/
                            bar/
                                Quux.java
        et fucking cetera

Build scripts for most languages expect something very like this, and it's easy to import one package's source into another, so I could put common code in a "Marklib" project, and get work done.

Making this happen in Julia was a lot more difficult. With a little help from Slack I made sense of the terrible package documentation for Julia and the incomprehensible errors, and wrote a script juliaMakePackage.zsh:

#!/bin/zsh
if [[ $# -ne 1 ]]; then
    echo "Usage: juliaMakePackage.zsh NAME"
    exit 1
fi
name=$1
devdir=$HOME/Code/CodeJulia
cd $devdir
julia -E "using Pkg; Pkg.activate(\".\"); Pkg.generate(\"${name}\")"
cd $name
git init
git add .
git commit -m "Initial commit"
cd ..
julia -E "using Pkg; Pkg.develop(PackageSpec(url=\"${devdir}/${name}\"))"

And then added the main dir and all packages I make to ~/.julia/config/startup.jl:

# startup.jl

push!(LOAD_PATH, pwd())
push!(LOAD_PATH, "$(homedir())/Code/CodeJulia")
push!(LOAD_PATH, "$(homedir())/Code/CodeJulia/Marklib")

println("READY $(pwd())")

Now finally I can:

% julia
READY /Users/mdh
julia> using Marklib
julia> Marklib.greet()
Hello World!
julia> 

And from there start putting in my libs. Each one needs a package and a startup entry; I may have to automate that by walking my code dir. Waste of several hours figuring that out.

Julia String Concatenation

Last time, I was uncertain about string concatenation, so I did a test:

#!/usr/bin/env julia

const kTestText = "abcdefghijklmnopqrstuvwxyz0123456789\n"
const kLoops = 10000

function stringString()
    s = ""
    for i in 1:kLoops
        s = "$s$kTestText"
    end
    return s
end

function bufferString()
    sb = IOBuffer()
    for i in 1:kLoops
        print(sb, kTestText)
    end
    return String(take!(sb))
end

function vectorString()
    sb = Vector()
    for i in 1:kLoops
        push!(sb, kTestText)
    end
    return join(sb, "")
end

function typedVectorString()
    sb = Vector{AbstractString}()
    for i in 1:kLoops
        push!(sb, kTestText)
    end
    return join(sb, "")
end

println("*** stringString")
@timev stringString()
sleep(2)

println("\n*** bufferString")
@timev bufferString()
sleep(2)

println("\n*** vectorString")
@timev vectorString()
sleep(2)

println("\n*** typedVectorString")
@timev typedVectorString()

*** stringString
  1.100197 seconds (21.24 k allocations: 1.725 GiB, 15.94% gc time)
elapsed time (ns): 1100197167
gc time (ns):      175349222
bytes allocated:   1851904041
pool allocs:       11292
non-pool GC allocs:9950
GC pauses:         79

*** bufferString
  0.006864 seconds (11.80 k allocations: 1.134 MiB)
elapsed time (ns): 6864042
bytes allocated:   1189493
pool allocs:       11794
non-pool GC allocs:3
realloc() calls:   8

*** vectorString
  0.017380 seconds (26.68 k allocations: 2.191 MiB)
elapsed time (ns): 17380237
bytes allocated:   2297091
pool allocs:       26659
non-pool GC allocs:9
realloc() calls:   8

*** typedVectorString
  0.031384 seconds (44.75 k allocations: 2.999 MiB, 10.00% gc time)
elapsed time (ns): 31384383
gc time (ns):      3137654
bytes allocated:   3144221
pool allocs:       44730
non-pool GC allocs:8
realloc() calls:   8
GC pauses:         1

Well, there's me told off. I expected #1 typedVector, #2 vector, #3 buffer, then stringString way at the bottom. Instead the first 3 are reversed.

IOBuffer, as ugly as it is, is the clear winner. Vector did OK, but twice as much CPU & RAM loses. Amusing that typedVector is twice as slow and memory-heavy as the untyped (explained ). On larger loops, buffer gets slower, but vector remains a memory pig, and in GC that's unacceptable. Of course stringString is terrible, and it's almost exactly the same for string(s, kTestText).

Time to rewrite some text processing.

More Julia

Decided to take another day on Julia, write something more serious and see how that goes.

There's an uber-juno "IDE" plugin for Atom, which at least turns on syntax highlighting and puts an interactive console in the editor. Yay. It's not capable of linting yet, though it says it is.

So I'm rewriting a simple 1970s-style dungeon crawl game (Chorus:"As if you could make any other kind of game, Mark!" Mark:"I assure you I could, I just choose not to.") as a test of data structures and application programming in Julia. It's a little challenging, but not impossible.

Using Modules

The current directory is not included in the default LOAD_PATH, so you can't import local modules right off. One solution is to put it in your startup:

% mkdir -p ~/.julia/config
% echo '@everywhere push!(LOAD_PATH, pwd())' >>~/.julia/config/startup.jl

All names are in the same namespace. This means your module and a struct or method in it can't have the same names… My current solution has been to pluralize the module, so GridMaps contains a struct GridMap.

The export rules are a little annoying. If you use the @enum macro to make a ton of constants, they aren't exported when you export the enum type; you can either manually export each constant name, or just use ModuleName.ConstantName in other modules. Bleh.

Debugging

Bug #1: The terminator problem is pretty bad in Julia:

% cat Foo.jl
module Foo
for i=1:10
    println(i)
#missing end
end #module

% julia Foo.jl
ERROR: LoadError: syntax: incomplete: "module" at /Users/mdh/Code/CodeJulia/Foo.jl:1 requires end
Stacktrace:
 [1] include at ./boot.jl:317 [inlined]
 [2] include_relative(::Module, ::String) at ./loading.jl:1038
 [3] include(::Module, ::String) at ./sysimg.jl:29
 [4] exec_options(::Base.JLOptions) at ./client.jl:229
 [5] _start() at ./client.jl:421
in expression starting at /Users/mdh/Code/CodeJulia/Foo.jl:1

Good luck finding that missing end if you have a 1000-line module. C used to be just as bad about semicolons and braces, but modern compilers are pretty good at guessing where you fucked up. Python's whitespace-as-control is brilliant, because you can't ever do that. A passable solution would be each control keyword having its own unique end keyword, but it's too late for that. In the actual bug, I had to comment out half the code, run the module, uncomment and comment the other half, repeat until I isolated it.

Bug #2: Type annotations need to be very generic or left off entirely. As code-as-documentation, I declare a function as parseLine(line::String), and it gives me:

MethodError(Main.Foo.parseLine, ("a",), 0x00000000000061bb)

Well, thanks. Turns out I need to use AbstractString, because a prior function returns an AbstractString and not String. Or I can just leave the typing off, as was my first instinct.

String Concatenation

This is super ugly. Currently I've fallen back on:

sb = Vector()
push!(sb, "part of ")
push!(sb, "a string")
return join(sb, "")

Strings aren't mutable, and there's no StringBuffer/NSMutableString equivalent. The other option is to use an IOBuffer. I haven't done timings yet to see which is faster/uses less memory, I just find pushing to a vector simpler.

Switch

There's no switch statement. I could put functions in a dictionary and dispatch on that, which is not ideal for looping over a bunch of simple values, and requires passing around control flags instead of a simple break or return. Caveman solution is a chain of if/elseif, but I don't like it. Possibly a macro could be written?

Binary

Turns out you can make binaries: Julia apps on the App Store: Building and distributing an application written in Julia

It's an ugly process, that Nathan's working around, but it's a start. This should 100% be in the core libraries, and should have a cross-compiler.

Getting this working is tomorrow's problem, I think.

Julia

Interesting language, originally a math/statistics package but now as general-purpose as any lang. More or less Pythonic, though it has some type-annotation stuff, and heavily-optimized Julia looks like a mess of annotations with your code buried somewhere inside.

The Mac version comes as a dmg with an app (which I'd prefer for easy install/uninstall), or brew (which I prefer not to use). The app just launches a single command in a new Terminal window; add that path/bin to the PATH in your .profile, e.g.:

export JULIA_HOME="$HOME/Applications/Julia-1.0.app/Contents/Resources/julia"
export MANPATH="$MANPATH:$JULIA_HOME/share/man"
export PATH="$PATH:$JULIA_HOME/bin"

And now in Terminal:

% echo 'println("Hello, world!")' >hello.jl
% julia hello.jl
Hello, world!

The only way I can see to make it compile to a binary is embedding, and I'm not clear on how you package that with a full Julia distribution yet. That's unfortunate. I like REPL workflows as much as anyone, but binaries are what "normal" people run.

Getting Started

<voice tone="excessively chipper"> Let's read the manual! </voice>

Syntax is nicer than usual: function/end, if/elseif/else/end, for/end, while/end, begin/end, let/end, which beats the hell out of Python's def, if/elif/else; to say nothing of abominations like Swift's "func". No do/while loop, which is annoying especially for file processing, but I suspect that can be fixed with macros.

There's a lot of ways to write functions, which is nice but allows some ugly choices. Anonymous functions are x->x^2 or function(x) x^2 end; named functions can just be assigned f(x)=x^2 or written in full:

function f(x)
    return x^2
end

Whitespace is not significant, and indentation is not enforced, which is a major bummer for style-enforcing-structure, but I'm sure sloppy jerks will love that.

You can use tuples for multiple returns, or as ad-hoc structures:

> point(x, y) = (x=x, y=y)
point (generic function with 1 method)
> p = point(13, 2)
(x = 13, y = 2)
> a, b = p
(x = 13, y = 2)
> a
13

It's pass-by-reference, not copying, so be careful with mutable data.

My only real kvetch so far is that arrays are 1-indexed and column-major, like FORTRAN, not 0-indexed and row-major, like C. For a numeric package, that makes sense, but for other programming tasks it's frustrating and error-prone, see EWD 831.

This is a functional language, and there are no classes/inheritance/methods, however "methods" are functions which are overloaded based on types, and can be used like class methods:

> quack(x::Int64) = "int $x"
> quack(x::Float64) = "float $x"
> quack(1)
"int 1"
> quack(6.66)
"float 6.66"

As well, you can use closures to make pseudo-classes, the same way you do in Scheme:

let state = 0
    global counter() = (state += 1)
    global counterReset() = (state = 0)
end

struct (immutable) and mutable struct make "Composite types", and can make objects the usual way:

struct Point
    x
    y
end
pointDist(p::Point) = sqrt(p.x^2 + p.y^2)
Base.show(io::IO, p::Point) = print(io, "{$(p.x),$(p.y)}")

> p = Point(6, 6)
{6,6}
> pointDist(p)
8.48528137423857

The default constructor can be overridden at the end of the field list, it's defined as Point(x,y) = new(x,y). The "toString" equivalent there is ugly as hell, but there's a ton of options for overloading it by type of output.

There's a lot of fucking around with generics and strong typing (for weak minds), but ignore all that crap.

Interfaces are a somewhat messy use of several methods to create pseudo-types; define the basic interface methods for your type, and most things calling those interface methods will work. So, a couple iter() functions and you have an iterable, and so on. This would work much better if Julia had an actual OOP class system and real interfaces, but Python half-asses interfaces the same way and aside from being 1000x slower than you'd like, it gets by.

Quickly skimming modules, seems pretty standard import mechanism, but I don't see any way to make something private. OK, I'm bored of reading docs. Let's do something. Something semi-practical here, my standard RPN calculator, one command per line.

Docs/libraries are kind of a mess, Vector is discussed in Base.Arrays, push!/pop! methods are discussed in Collections (an interface). parse is under Numbers, not Strings, as one might expect.

eof() does the extremely unfortunate thing of blocking for input, so it's utterly useless in a main interactive loop.

… About 30 minutes later, I have a working, final version. Well, that was pretty easy, and it's a clean implementation, other than the interactive loop.

Next time I open this, I'll put it in a module, and tokenize the line instead of requiring just one token per line, and have some command-line argument to suppress help and prompts.

I should also investigate IJulia which is a Jupyter notebook, which seems like the "expected" way to make it interactive and handle graphics or media.

RPNCalc.jl

#!/usr/bin/env julia
# RPNCalc.jl
# Copyright ©2018 by Mark Damon Hughes. All Rights Reserved.

stack = Vector()

function checkStack(n)
    if length(stack) < n
        error("Stack underflow: Needs $n values")
    end
end

function parseLine(s)
    s = strip(s)
    if s == "+"
        checkStack(2)
        b = pop!(stack); a = pop!(stack)
        push!(stack, a + b)
    elseif s == "-"
        checkStack(2)
        b = pop!(stack); a = pop!(stack)
        push!(stack, a - b)
    elseif s == "*"
        checkStack(2)
        b = pop!(stack); a = pop!(stack)
        push!(stack, a * b)
    elseif s == "/"
        checkStack(2)
        b = pop!(stack); a = pop!(stack)
        push!(stack, a / b)
    elseif s == "="
        checkStack(1)
        println(stack[end])
    else
        push!(stack, parse(Float64, s) )
    end
end

function main()
    println("RPN Calc: Type numbers or operators (+, -, *, /) one at a time, = to show top of the stack, ^D to end.")
    while true
        print("> "); flush(stdout)
        s = readline(stdin, keep=true)
        if s == ""
            println("Goodbye")
            break
        end
        try
            parseLine(s)
        catch e
            println(e.msg)
        end
    end
end

main()

100 Days of Programmer Burnout

This is a fantastic way to make someone hate computers and programming, and take up a life of farming and/or sex work so they never have to see a computer again. (Which, sometimes doesn't seem such a bad idea, anyone want an aging gigolo in between potato plantings?)

When I was trying to do the Advent of Code, I couldn't do half the days because life got in the way. 100 days would be homicide-inducing, not habit-forming.

Here's my challenge: Program when you feel like it, when you have some problem amenable to computational solution. Take the weekend off; or if it's not your day job, only program on the weekends. Spend as many minutes or hours as it takes. For the love of fuck, do not commit to a Microsoft® Github® log, what is wrong with these people?!

Bring Out the Type System

By the way, about void-safety: for a decade now, Eiffel has been void-safe, meaning a compile-time guarantee of no run-time null pointer dereferencing. It is beyond my understanding how the rest of the world can still live with programs that run under myriad swords of Damocles: x.op (…) calls that might any minute, without any warning or precedent, hit a null x and crash.
—Bertrand Meyer, Why not program right?

I knew this would be exasperating, but really now. At this point, my eyes rolled completely out of my head and I no longer have eyes. 🙄

References don't just randomly become null without warning. You chose to call a function that might return null, and didn't bother to put in an if or assert when that's a possibility. Typically the exception system catches it if you do miss it.

The Objective-C model of nil messaging just returning nil or 0 was theoretically dangerous, but in practice incredibly useful. Crashing out in Javascript means I have to wrap everything with (x ? x.op() : null) to get the same effect, which might require a lot of temp vars.

Do type devotees actually believe in randomly-appearing errors, or that dynamic programmers just flail our limbs on a keyboard until something manages to pass tests, or do they just exaggerate a rare edge case they saw once, or are they completely fabricating this stuff to justify their waste of time/perversion?

Type systems are self-inflicted BDSM, and it is not self-evident that everyone wants to wear a gimp suit.

Dr Dobb's Journal of Computer Calisthenics & Orthodontia

DDJ, especially the run 1984 to 2000-ish, is how I learned C and assembly, and much of my attitude towards software.

These days I read PragPub ed. by Michael Swaine of DDJ infamy, though it's more architect/software manager-oriented than in-the-trenches bit-eating, but it still has some real working code.

Also, I really miss Jolt Cola, like a detoxing junkie misses a needle.

Python 3.7

  • Python 3.7 released: Standard checklist:
    • Run installer
    • Delete the old 3.6 folder from /Applications
    • Run the certificate command in the new 3.7 folder (the other shits a PATH into my shell profile, don't need it)
    • Run IDLE and verify it's 3.7.0. Happily, no longer have to fight with updating Tcl/Tk.
    • Run "python3" from Terminal and verify it's 3.7.0
    • Run a random Python script to make sure nothing's broken.

Nanosecond-accurate time functions and switching more ASCII/C Locale into UTF-8 are nice improvements, but those are more patching up legacy annoyances than "must have".

I'm mostly interested in dataclasses, which makes it much easier to build little struct-type objects instead of random dicts or lists which have all sorts of problems (no equality, hashing, typo-safety).

I greatly dislike the addition of BDSM typing, but it's mostly optional, EXCEPT you have to use them in dataclasses:

from dataclasses import dataclass
@dataclass
class Point:
    x : float = 0.0
    y : float = 0.0

>>> p = Point()
>>> p
Point(x=0.0, y=0.0)
>>> q = Point(1.1, 2.2)
>>> q
Point(x=1.1, y=2.2)

If I define Point without the type annoytations[my new favorite typo!], only the default constructor works, and it doesn't print the fields as a string.

@dataclass
class Pointless:
    x = 0.0
    y = 0.0

>>> f = Pointless()
>>> f
Pointless()
>>> f.x
0.0
>>> f.y
0.0

Real examples might be a lot more complex than a point, and by then the cost of building a proper class with __init__ and everything yourself isn't such a big deal, so I can see dataclasses mostly being used for very simple struct-like containers.