Tuesday, November 25, 2008

Unicode in Squeak

(Excuse the odd formatting; blogger.com isn't liking my Unicode symbols in this post and is doing odd things with line heights)

Somebody on the Squeak mailing list asked how to create an open-ended Interval. I came up with:

1 to: ∞

I had added "∞" as a global variable equal to "Float infinity" and my example... just worked! While I was looking through the character table for the infinity sign, I came across a ton of other gems. These would all be quite possible in Squeak, and many of them would be trivial.

( (22 ÷ 7) ≈ (π ± ¼) ) → true or false. π and ¼ are constants, and ± would return an object representing a numeric accuracy of some sort.
((c ∪ d) ⊂ e) → true or false for collections c, d, and e.
(a ∧ b ∨ c) ¬ " The not-sign needs to come after expressions. "

1 … 3 → returns an interval. The ellipsis is a single Unicode character.
2¹⁶ → 2 raisedTo: 16.
∅ → An empty, immutable Set instance.
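For comparison, here is a rough sketch of how a few of these read in Squeak today, using only ordinary messages. The variable names and sample values are made up for illustration, and the subset test is simply spelled out as "every element of the union is in e"; the Unicode selectors themselves are not assumed to exist.

```smalltalk
| c d e union |
c := Set withAll: #(1 2).
d := Set withAll: #(2 3).
e := Set withAll: #(1 2 3 4).

"(c ∪ d) ⊂ e, spelled with ordinary messages."
union := Set new.
union addAll: c; addAll: d.
union allSatisfy: [:each | e includes: each].   "true"

"1 … 3"
(1 to: 3) asArray.   "#(1 2 3)"

"2¹⁶"
2 raisedTo: 16.   "65536"
```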

There are loads of symbols available that would work as constants, method selectors, variables (Greek letters, anybody?) and so forth. Some of them won't work, such as using a dollar sign for currency values, since '$' already introduces character literals. I'm not sure about '∃' and '∀' because of Smalltalk's message order; these two symbols are written as prefixes, before the expression they apply to.

Another useful symbol would be some sort of concatenation operator, but (not being a proper mathematician) I don't know one. This operator would allow you to easily make a collection, e.g.

#a | ∅ " A new set containing #a. This could be implemented, but '|' is already used in Boolean operations. "
varA | varB | varC | ∅ " Shorthand for making a collection. Replace '∅' to change the type of collection. "
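As a point of reference, the two shorthand lines above correspond roughly to the following in stock Squeak. This is only a sketch: the values assigned to varA, varB and varC are placeholders, and the results are ordinary mutable collections rather than anything built from an immutable '∅'.

```smalltalk
| varA varB varC |
varA := 1. varB := 2. varC := 3.

"#a | ∅"
Set with: #a.

"varA | varB | varC | ∅"
Set with: varA with: varB with: varC.

"Swapping the class changes the kind of collection."
OrderedCollection with: varA with: varB with: varC.
```

One wrinkle with the shorthand: Smalltalk evaluates binary messages strictly left to right, so varA | varB would be sent before any collection is in sight. A real implementation would presumably need '|' to be understood by the elements themselves.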

There's a Unicode character called a "Character tie": http://www.fileformat.info/info/unicode/char/2040/index.htm. Would this make a good concatenation operator?

varA ⁀ varB ⁀ varC ⁀ ∅ " Meh "

3 comments:

gulik said...

What characters could be used to represent an empty immutable dictionary and empty immutable ordered collection?

An open circle arrow could be used for either: '↺'. It kind of looks like a '∅'. There's '∆' which could represent an empty ordered collection. An array needs a box of some sort, such as '▤'.

There's a bunch of enclosed operators: ⊕⊖⊗⊘⊙⊚⊛⊞⊡ etc.

So using these, you could make:
#b | #a | ∆ "an OrderedCollection"
#a→1 | #b→2 | ↺ "a Dictionary"
42 | 55 | ▤ "an Array"
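
For comparison, a sketch of what those three lines amount to with the existing class-side messages. The results here are plain mutable collections, so this only mirrors the contents, not the immutability.

```smalltalk
"#b | #a | ∆"
OrderedCollection with: #b with: #a.

"#a→1 | #b→2 | ↺"
Dictionary new at: #a put: 1; at: #b put: 2; yourself.

"42 | 55 | ▤"
Array with: 42 with: 55.
```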

Antony Blakey said...

It would, IMO, be a really bad idea to use mathematical/logical symbols in a way that doesn't match their existing meaning.

∆ already has a meaning, nothing like an ordered collection. Set union is ∪, not |, and while I'm not familiar with the mathematical uses of ↺, I'm sure it's not a good idea to simply assign new meaning to symbols on the basis of their shape.

Having said that, I'd like to use these symbols in ST properly.

Stefan Schmiedl said...

Tell me when you've arrived at APL :-)