uxntal syntax
lab brief | what is a generative grammar for uxntal? |
conducted | JL 2025/05/07–2025/05/09 |
purpose
uxntal is the primary programming language of the uxn virtual machine. While the interface of the underlying virtual machine has been frozen against further change, the language it hosts has been developed not towards a specification, but as a means to an end.
In practice, uxntal newcomers are active participants in the language's clarification and refinement, wanting (and often needing) guidance from the active ecosystem of users. While a majority of this clarification relates to the language semantics, the surface-level syntax is itself also a nuanced object which is difficult to communicate.
The purpose of this experiment is to write down a generative grammar which shows a valid construction for every uxntal syntactic form. Writing these rules is a much weaker problem than writing a specification of the language: the set of generated programs is only a conservative subset of all valid uxntal programs.
method
- #uxn on the libera.chat IRC server.
- #uxn on the concatenative Discord server.
instrumentation
uxntal versioning
uxntal assembler | drifblim rev. 9 May 2025 |
uxn emulator | uxncli rev. 19 Oct 2024 |
results
- The grammar is not to be confused with a uxntal specification,
- its generated programs are valid syntactic constructions,
- their set is a conservative subset of all valid syntactic constructs,
- they are not always meaningful,
- and they rarely follow idioms of the language as used.
program
The generative grammar is given in a Backus-Naur Form with two extensions.
First, as notation shorthands, take ␣ to be any amount of whitespace separation, <hex1> to be any hexadecimal character 0–f, and <hexn> a string of n hexadecimal characters (e.g., <hex2> encoding 00–ff, a byte).
Second, uxntal naming forms are defined as being not something, which resists BNF expression: instead, take as given:
- a <STRING> is anything but its closing ␣
- a <COMMENT> is anything but its closing ␣)
- an <ID> is anything but the finite sets <opcode> and reserved hexadecimal strings <hex1> <hex2> <hex3> <hex4>.
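The two shorthands above can be sketched as predicates. This is a minimal Python sketch, not part of the grammar itself: the function names `is_hexn` and `is_id` are hypothetical, hexadecimal digits are assumed to be lowercase (following the 0–f range given above), and an <ID> is additionally assumed to be non-empty and free of whitespace. The opcode set is passed in as a parameter rather than listed here.

```python
import re

def is_hexn(token: str, n: int) -> bool:
    """True if token is exactly n lowercase hexadecimal characters (<hexn>)."""
    return re.fullmatch(r"[0-9a-f]{%d}" % n, token) is not None

def is_id(token: str, opcodes: set) -> bool:
    """True if token may serve as an <ID> under the 'anything but' rule."""
    # Assumption (not stated in the grammar): IDs are non-empty and
    # contain no whitespace, since whitespace separates tokens.
    if not token or any(ch.isspace() for ch in token):
        return False
    # Excluded: anything in the finite set <opcode>.
    if token in opcodes:
        return False
    # Excluded: the reserved hexadecimal strings <hex1> .. <hex4>.
    return not any(is_hexn(token, n) for n in (1, 2, 3, 4))
```

Under these assumptions `is_id("loop", {"ADD"})` holds, while `add` and `cafe` are rejected as <hex3> and <hex4> strings and `ADD` is rejected as an opcode.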
<program> | = | concatenating |
| | <program>␣<program> | |
operation | ||
| | <opcode> | |
assembler control | ||
| | $<ID> | |
| | $<hex1> | |
| | $<hex2> | |
| | $<hex3> | |
| | $<hex4> | |
| | |<ID> | |
| | |<hex1> | |
| | |<hex2> | |
| | |<hex3> | |
| | |<hex4> | |
literals | ||
| | <ID> | |
| | <hex2> | |
| | "<STRING> | |
| | #<hex2> | |
| | #<hex4> | |
referencing | ||
| | ,<refer> | |
| | _<refer> | |
| | .<refer> | |
| | ~<refer> | |
| | ;<refer> | |
| | =<refer> | |
defining IDs | ||
| | %<ID>␣{␣<program>␣} | |
| | @<ID>␣<program> | |
| | &<ID>␣<program> | |
bracketing | ||
| | [␣<program>␣] | |
| | (␣<COMMENT>␣) | |
A <refer> is either a named <ID> or an anonymous block (a lambda):
<refer> | = | <ID> |
| | {␣<program>␣} | |
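To illustrate what "generative" means here, the following Python sketch draws random strings from a subset of the <program> and <refer> rules. It is an illustration under stated assumptions, not the method used in this experiment: `OPCODES` and `IDS` are small hypothetical samples standing in for the full <opcode> and <ID> sets, only some of the runes are included, ␣ is always rendered as a single space, and recursion is cut off by a depth bound that the grammar itself does not have.

```python
import random

# Hypothetical sample terminals, standing in for the full sets.
OPCODES = ["BRK", "ADD", "DUP2", "JMP2r"]
IDS = ["main", "loop", "x"]
HEX = "0123456789abcdef"

def hexn(rng, n):
    """A random <hexn>: n hexadecimal characters."""
    return "".join(rng.choice(HEX) for _ in range(n))

def refer(rng, depth):
    """<refer> = <ID> | { <program> }, the latter only while depth remains."""
    if depth > 0 and rng.random() < 0.2:
        return "{ " + program(rng, depth - 1) + " }"
    return rng.choice(IDS)

def program(rng, depth):
    """One random derivation of <program>, at most `depth` levels deep."""
    leaves = [
        lambda: rng.choice(OPCODES),                # operation
        lambda: "$" + hexn(rng, 1),                 # assembler control
        lambda: "|" + hexn(rng, 4),
        lambda: "#" + hexn(rng, 2),                 # literals
        lambda: "#" + hexn(rng, 4),
        lambda: rng.choice(",.;") + refer(rng, depth),  # referencing
    ]
    if depth <= 0:
        return rng.choice(leaves)()
    d = depth - 1
    nodes = leaves + [
        lambda: program(rng, d) + " " + program(rng, d),               # concatenating
        lambda: "[ " + program(rng, d) + " ]",                         # bracketing
        lambda: "@" + rng.choice(IDS) + " " + program(rng, d),         # defining IDs
        lambda: "%" + rng.choice(IDS) + " { " + program(rng, d) + " }",
    ]
    return rng.choice(nodes)()

print(program(random.Random(0), 3))
```

As the results note, strings produced this way are syntactic constructions only: they are not expected to be meaningful or idiomatic uxntal.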
opcode
<core-opcode> | = | arithmetic |
| | INC | ADD | SUB | |
| | MUL | DIV | SFT | |
| | AND | ORA | EOR | |
test conditionals | ||
| | EQU | NEQ | |
| | GTH | LTH | |
stack combinators | ||
| | POP | NIP | SWP | |
| | DUP | OVR | |
| | ROT | STH | |
memory & I/O | ||
| | LIT | DEI | DEO | |
| | STZ | STR | STA | |
| | LDZ | LDR | LDA | |
jumps & control flow | ||
| | JMP | JCN | JSR | |
The <core-opcode> forms are extended with four additional forms, and by mode suffixes:
<opcode> | = | BRK |
| | JCI | JMI | JSI | |
| | <core-opcode> | |
| | <core-opcode>2 | |
| | <core-opcode>k | |
| | <core-opcode>r | |
| | <core-opcode>2k | |
| | <core-opcode>2r | |
| | <core-opcode>kr | |
| | <core-opcode>2kr | |
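The <opcode> rule can be read as a finite expansion: each core opcode yields eight spellings, one per mode suffix, and four further forms are added unchanged. A minimal Python sketch of that expansion follows; the names `MODES`, `expand_modes` and `all_opcodes` are hypothetical, and the list of core opcodes is passed in by the caller rather than repeated here.

```python
# The eight suffixes, in the order the <opcode> rule lists them.
MODES = ["", "2", "k", "r", "2k", "2r", "kr", "2kr"]

def expand_modes(core):
    """All eight spellings of one <core-opcode>."""
    return [core + mode for mode in MODES]

def all_opcodes(core_opcodes):
    """The <opcode> set: four additional forms plus every mode expansion."""
    extra = ["BRK", "JCI", "JMI", "JSI"]
    return extra + [op for core in core_opcodes for op in expand_modes(core)]

print(expand_modes("ADD"))
```

For a list of n core opcodes, `all_opcodes` returns 4 + 8n spellings, which is the finite set that an <ID> must avoid.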