doc/ysh-tour.md

1	---
2	default_highlighter: oils-sh
3	---
4
5	A Tour of YSH
6	=============
7
8	<!-- author's note about example names
9
10	- people: alice, bob
11	- nouns: ale, bean
12	- peanut, coconut
13	- 42 for integers
14	-->
15
16	This doc describes the [YSH]($xref) language from a clean-slate
17	perspective. We don't assume you know Unix shell, or the compatible
18	[OSH]($xref). But shell users will see the similarity, with simplifications
19	and upgrades.
20
21	Remember, YSH is for Python and JavaScript users who avoid shell! See the
22	[project FAQ][FAQ] for more color on that.
23
24	[FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
25
26	This document is long because it demonstrates nearly every feature of the
27	language. You may want to read it in multiple sittings, or read [The Simplest
28	Explanation of
29	Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
30	(Until 2023, YSH was called the "Oil language".)
31
32
33	Here's a summary of what follows:
34
35	1. YSH has interleaved word, command, and expression languages.
36	- The command language has Ruby-like blocks, and the expression language
37	has Python-like data types.
38	2. YSH has both builtin commands like `cd /tmp`, and builtin functions like
39	`join()`.
40	3. Languages for data, like [JSON][], are complementary to YSH code.
41	4. OSH and YSH share both an interpreter data model and a process model
42	(provided by the Unix kernel). Understanding these common models will make
43	you both a better shell user and YSH user.
44
45	Keep these points in mind as you read the details below.
46
47	[JSON]: https://json.org
48
49	<div id="toc">
50	</div>
51
52	## Preliminaries
53
54	Start YSH just like you start bash or Python:
55
56	<!-- oils-sh below skips code block extraction, since it doesn't run -->
57
58	```sh-prompt
59	bash$ ysh # assuming it's installed
60
61	ysh$ echo 'hello world' # command typed into YSH
62	hello world
63	```
64
65	In the sections below, we'll save space by showing output in comments, with
66	`=>`:
67
68	echo 'hello world' # => hello world
69
70	Multi-line output is shown like this:
71
72	echo one
73	echo two
74	# =>
75	# one
76	# two
77
78	## Examples
79
80	### Hello World Script
81
82	You can also type commands into a file like `hello.ysh`. This is a complete
83	YSH program, which is identical to a shell program:
84
85	echo 'hello world' # => hello world
86
87	### A Taste of YSH
88
89	Unlike shell, YSH has `var` and `const` keywords:
90
91	const name = 'world' # const is rarer, used at the top level
92	echo "hello $name" # => hello world
93
94	They take rich Python-like expressions on the right:
95
96	var x = 42 # an integer, not a string
97	setvar x = min(x, 1) # mutate with the 'setvar' keyword
98
99	setvar x += 5 # Increment by 5
100	echo $x # => 6
101
102	var mylist = [x, 7] # two integers [6, 7]
103
104	Expressions are often surrounded by `()`:
105
106	if (x > 0) {
107	echo 'positive'
108	} # => positive
109
110	for i, item in (mylist) { # 'mylist' is a variable, not a string
111	echo "[$i] item $item"
112	}
113	# =>
114	# [0] item 6
115	# [1] item 7
116
117	YSH has Ruby-like blocks:
118
119	cd /tmp {
120	echo hi > greeting.txt # file created inside /tmp
121	echo $PWD # => /tmp
122	}
123	echo $PWD # prints the original directory
124
125	And utilities to read and write JSON:
126
127	var person = {name: 'bob', age: 42}
128	json write (person)
129	# =>
130	# {
131	# "name": "bob",
132	# "age": 42
133	# }
134
135	echo '["str", 42]' | json read # sets '_reply' variable by default
136
137	### Tip: Use the `=` operator interactively
138
139	The `=` keyword evaluates and prints an expression:
140
141	= _reply
142	# => (List) ["str", 42]
143
144	(Think of it like `var x = _reply`, without the `var`.)
145
146	The best way to learn YSH is to type these examples and see what happens!
147
148	## Word Language: Expressions for Strings (and Arrays)
149
150	Let's describe the word language first, and then talk about commands and
151	expressions. Words are a rich language because strings are a central
152	concept in shell.
153
154	### Unquoted Words
155
156	Words denote strings, but you often don't need to quote them:
157
158	echo hi # => hi
159
160	Quotes are useful when a string has spaces, or punctuation characters like `( )
161	;`.
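
To see how quoting changes the boundaries between words, here is a minimal sketch. It assumes the `printf` builtin and uses made-up strings; it sticks to syntax that YSH shares with POSIX shell:

```sh
# printf repeats the '%s\n' format once per argument, so each word
# that the shell passes along lands on its own line.
printf '%s\n' one 'two words' '(three)'
# =>
# one
# two words
# (three)
```

The quoted `'two words'` stays a single argument, and the quoted parentheses are treated as literal characters.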
162
163	### Three Kinds of String Literals
164
165	You can choose the style that's most convenient to write a given string.
166
167	#### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
168
169	Double-quoted strings allow interpolation, with `$`:
170
171	var person = 'alice'
172	echo "hi $person, $(echo bye)" # => hi alice, bye
173
174	Write operators by escaping them with `\`:
175
176	echo "\$ \" \\ " # => $ " \
177
178	In single-quoted strings, all characters are literal (except `'`, which
179	can't be expressed):
180
181	echo 'c:\Program Files\' # => c:\Program Files\
182
183	If you want C-style backslash character escapes, use a J8 string, which is
184	like JSON, but with single quotes:
185
186	echo u' A is \u{41} \n line two, with backslash \\'
187	# =>
188	# A is A
189	# line two, with backslash \
190
191	The `u''` strings are guaranteed to be valid Unicode (unlike JSON). You can
192	also use `b''` strings:
193
194	echo b'byte \yff' # Byte that's not valid unicode, like \xff in C.
195	# Don't confuse it with \u{ff}.
196
197	#### Multi-line Strings
198
199	Multi-line strings are surrounded with triple quotes. They come in the same
200	three varieties, and leading whitespace is stripped in a convenient way.
201
202	sort <<< """
203	var sub: $x
204	command sub: $(echo hi)
205	expression sub: $[x + 3]
206	"""
207	# =>
208	# command sub: hi
209	# expression sub: 9
210	# var sub: 6
211
212	sort <<< '''
213	$2.00 # literal $, no interpolation
214	$1.99
215	'''
216	# =>
217	# $1.99
218	# $2.00
219
220	sort <<< u'''
221	C\tD
222	A\tB
223	''' # b''' strings also supported
224	# =>
225	# A B
226	# C D
227
228	(Use multiline strings instead of shell's [here docs]($xref:here-doc).)
229
230	### Three Kinds of Substitution
231
232	YSH has syntax for 3 types of substitution, all of which start with `$`. That
233	is, you can convert any of these things to a string:
234
235	1. Variables
236	2. The output of commands
237	3. The value of expressions
238
239	#### Variable Sub
240
241	The syntax `$a` or `${a}` converts a variable to a string:
242
243	var a = 'ale'
244	echo $a # => ale
245	echo _${a}_ # => _ale_
246	echo "_ $a _" # => _ ale _
247
248	The shell operator `:-` is occasionally useful in YSH:
249
250	echo ${not_defined:-'default'} # => default
251
252	#### Command Sub
253
254	The `$(echo hi)` syntax runs a command and captures its `stdout`:
255
256	echo $(hostname) # => example.com
257	echo "_ $(hostname) _" # => _ example.com _
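
A command sub can hold a whole pipeline, not just a single command. A small sketch, assuming the external `seq` and `tail` utilities are installed:

```sh
# The pipeline inside $() runs first; its stdout, minus the trailing
# newline, replaces the $() inside the double-quoted string.
echo "last = $(seq 3 | tail -n 1)" # => last = 3
```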
258
259	#### Expression Sub
260
261	The `$[myexpr]` syntax evaluates an expression and converts it to a string:
262
263	echo $[a] # => ale
264	echo $[1 + 2 * 3] # => 7
265	echo "_ $[1 + 2 * 3] _" # => _ 7 _
266
267	<!-- TODO: safe substitution with "$[a]"html -->
268
269	### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
270
271	There are four constructs that evaluate to a list of strings, rather than a
272	single string.
273
274	#### Globs
275
276	Globs like `*.py` evaluate to a list of files.
277
278	touch foo.py bar.py # create the files
279	write *.py
280	# =>
281	# bar.py
282	# foo.py
283
284	If no files match, it evaluates to an empty list (`[]`).
285
286	#### Brace Expansion
287
288	The brace expansion mini-language lets you write strings without duplication:
289
290	write {alice,bob}@example.com
291	# =>
292	# alice@example.com
293	# bob@example.com
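
When a word contains more than one brace group, every combination is produced. A sketch with made-up words, assuming the `printf` builtin and a shell that supports brace expansion (YSH or bash, not plain POSIX `sh`):

```sh
# Two brace groups in one word expand to all 2 x 2 pairings,
# varying the rightmost group fastest.
printf '%s\n' {pea,coco}nut-{1,2}
# =>
# peanut-1
# peanut-2
# coconut-1
# coconut-2
```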
294
295	#### Splicing
296
297	The `@` operator splices an array into a command:
298
299	var myarray = :| ale bean |
300	write S @myarray E
301	# =>
302	# S
303	# ale
304	# bean
305	# E
306
307	You also have `@[]` to splice an expression that evaluates to a list:
308
309	write -- @[split('ale bean')]
310	# =>
311	# ale
312	# bean
313
314	Each item will be converted to a string.
315
316	#### Split Command Sub / Split Builtin Sub
317
318	There's also a variant of command sub that decodes J8 lines into a sequence
319	of strings:
320
321	write @(seq 3) # write is passed 3 args
322	# =>
323	# 1
324	# 2
325	# 3
326
327	## Command Language: I/O, Control Flow, Abstraction
328
329	### Simple Commands
330
331	A simple command is a space-separated list of words. YSH looks up the first
332	word to determine if it's a builtin command, or a user-defined `proc`.
333
334	echo 'hello world' # The shell builtin 'echo'
335
336	proc greet (name) { # Define a unit of code
337	echo "hello $name"
338	}
339
340	# The first word now resolves to the proc you defined
341	greet alice # => hello alice
342
343	If it's neither, then it's assumed to be an external command:
344
345	ls -l /tmp # The external 'ls' command
346
347	Commands accept traditional string arguments, as well as typed arguments in
348	parentheses:
349
350	# 'write' is a string arg; 'x' is a typed expression arg
351	json write (x)
352
353	<!--
354	Block args are a special kind of typed arg:
355
356	cd /tmp {
357	echo $PWD
358	}
359	-->
360
361	### Redirects
362
363	You can redirect `stdin` and `stdout` of simple commands:
364
365	echo hi > tmp.txt # write to a file
366	sort < tmp.txt
367
368	Here are the most common idioms for using `stderr` (identical to shell):
369
370	ls /tmp 2>errors.txt
371	echo 'fatal error' >&2
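
The two output streams can be sent to different places in one command. A minimal sketch, in which the file names `out.txt` and `err.txt` are hypothetical and the child `sh` process stands in for any program that writes to both streams:

```sh
# The child shell prints one line on stdout and one on stderr.
# '>' captures the first stream and '2>' captures the second.
sh -c 'echo to-stdout; echo to-stderr >&2' > out.txt 2> err.txt

cat out.txt # => to-stdout
cat err.txt # => to-stderr
```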
372
373	### ARGV and ENV
374
375	At the top level, the `ARGV` list holds the arguments passed to the shell:
376
377	var num_args = len(ARGV)
378	ls /tmp @ARGV # pass shell's arguments through
379
380	Inside a `proc` without declared parameters, `ARGV` holds the arguments passed
381	to the `proc`. (Procs are explained below.)
382
383	---
384
385	You can add to the environment of a new process with a prefix binding:
386
387	PYTHONPATH=vendor ./demo.py # os.environ will have {'PYTHONPATH': 'vendor'}
388
389	Under the hood, the prefix binding temporarily augments the `ENV` object, which
390	is the current environment.
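
One way to observe that a prefix binding is temporary is to run the same child process with and without it. In this sketch, `GREETING` is a hypothetical variable name, assumed not to be set in your environment already:

```sh
# The binding is visible only to the process started on the same line.
GREETING=hello sh -c 'echo "$GREETING"' # => hello

# A later process does not see it, so the child falls back to a default.
sh -c 'echo "${GREETING:-not set}"' # => not set
```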
391
392	You can also mutate the `ENV` object:
393
394	setglobal ENV.PYTHONPATH = '.'
395	./demo.py # all future invocations have a different PYTHONPATH
396	./demo.py
397
398	And get its attributes:
399
400	echo $[ENV.PYTHONPATH] # => .
401
402	### Pipelines
403
404	Pipelines are a powerful method of manipulating data streams:
405
406	ls | wc -l # count files in this directory
407	find /bin -type f | xargs wc -l # count lines in each file of a subtree
408
409	The stream may contain (lines of) text, binary data, JSON, TSV, and more.
410	Details below.
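
A pipeline can have more than two stages, with each stage reading the previous stage's output. A small sketch, assuming the external `seq`, `sort`, and `head` utilities are installed:

```sh
# seq prints 1 through 5, sort -r reverses the order,
# and head keeps only the first two lines.
seq 5 | sort -r | head -n 2
# =>
# 5
# 4
```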
411
412	### Multi-line Commands
413
414	The `...` prefix lets you write long commands, pipelines, and `&&` chains
415	without `\` line continuations.
416
417	... find /bin # traverse this directory and
418	-type f -a -executable # print executable files
419	| sort -r # reverse sort
420	| head -n 30 # limit to 30 files
421	;
422
423	When this mode is active:
424
425	- A single newline behaves like a space
426	- A blank line (two newlines in a row) is illegal, but a line that has only a
427	comment is allowed. This prevents confusion if you forget the `;`
428	terminator.
429
430	### `var`, `setvar`, `const` to Declare and Mutate
431
432	Constants can't be modified:
433
434	const myconst = 'mystr'
435	# setvar myconst = 'foo' would be an error
436
437	Modify variables with the `setvar` keyword:
438
439	var num_beans = 12
440	setvar num_beans = 13
441
442	A more complex example:
443
444	var d = {name: 'bob', age: 42} # dict literal
445	setvar d.name = 'alice' # d.name is a synonym for d['name']
446	echo $[d.name] # => alice
447
448	That's most of what you need to know about assignments. Advanced users may
449	want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
450
451	<!--
452	var g = 1
453	var h = 2
454	proc demo(:out) {
455	setglobal g = 42
456	setref out = 43
457	}
458	demo :h # pass a reference to h
459	echo "$g $h" # => 42 43
460	-->
461
462	More info: [Variable Declaration and Mutation](variables.html).
463
464	### `for` Loop
465
466	#### Words
467
468	Shell-style for loops iterate over words:
469
470	for word in 'oils' $num_beans {pea,coco}nut {
471	echo $word
472	}
473	# =>
474	# oils
475	# 13
476	# peanut
477	# coconut
478
479	You can ask for the loop index with `i,`:
480
481	for i, word in README.md *.py {
482	echo "$i - $word"
483	}
484	# =>
485	# 0 - README.md
486	# 1 - __init__.py
487
488	#### Typed Data
489
490	To iterate over typed data, use parentheses around an expression. The
491	expression should evaluate to an integer `Range`, `List`, `Dict`, or `io.stdin`.
492
493	Range:
494
495	for i in (3 ..< 5) { # range operator ..<
496	echo "i = $i"
497	}
498	# =>
499	# i = 3
500	# i = 4
501
502	List:
503
504	var foods = ['ale', 'bean']
505	for item in (foods) {
506	echo $item
507	}
508	# =>
509	# ale
510	# bean
511
512	Again, you can request the index with `for i, item in ...`.
513
514	---
515
516	There are three ways of iterating over a `Dict`:
517
518	var mydict = {pea: 42, nut: 10}
519	for key in (mydict) {
520	echo $key
521	}
522	# =>
523	# pea
524	# nut
525
526	for key, value in (mydict) {
527	echo "$key - $value"
528	}
529	# =>
530	# pea - 42
531	# nut - 10
532
533	for i, key, value in (mydict) {
534	echo "$i - $key - $value"
535	}
536	# =>
537	# 0 - pea - 42
538	# 1 - nut - 10
539
540	That is, if you ask for two things, you'll get the key and value. If you ask
541	for three, you'll also get the index.
542
543	(One way to think of it: `for` loops in YSH have the functionality of Python's
544	`enumerate()`, `items()`, `keys()`, and `values()`.)
545
546	---
547
548	The `io.stdin` object iterates over lines:
549
550	for line in (io.stdin) {
551	echo $line
552	}
553	# lines are buffered, so it's much faster than `while read --raw-line`
554
555	<!--
556	TODO: Str loop should give you the (UTF-8 offset, rune)
557	Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
558	replacement.
559	-->
560
561	### `while` Loop
562
563	While loops can use a command as the termination condition:
564
565	while test --file lock {
566	sleep 1
567	}
568
569	Or an expression, which is surrounded in `()`:
570
571	var i = 3
572	while (i < 6) {
573	echo "i = $i"
574	setvar i += 1
575	}
576	# =>
577	# i = 3
578	# i = 4
579	# i = 5
580
581	### Conditionals
582
583	#### `if elif`
584
585	If statements test the exit code of a command, and have optional `elif` and
586	`else` clauses:
587
588	if test --file foo {
589	echo 'foo is a file'
590	rm --verbose foo # delete it
591	} elif test --dir foo {
592	echo 'foo is a directory'
593	} else {
594	echo 'neither'
595	}
596
597	Invert the exit code with `!`:
598
599	if ! grep alice /etc/passwd {
600	echo 'alice is not a user'
601	}
602
603	As with `while` loops, the condition can also be an expression wrapped in
604	`()`:
605
606	if (num_beans > 0) {
607	echo 'so many beans'
608	}
609
610	var done = false
611	if (not done) { # negate with 'not' operator (contrast with !)
612	echo "we aren't done"
613	}
614
615	#### `case`
616
617	The case statement is a series of conditionals and executable blocks. The
618	condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
619	like `/d+/`, or a typed expression like `(42)`:
620
621	var s = 'README.md'
622	case (s) {
623	*.py { echo 'Python' }
624	*.cc | *.h { echo 'C++' }
625	* { echo 'Other' }
626	}
627	# => Other
628
629	case (s) {
630	/ dot* '.md' / { echo 'Markdown' }
631	(30 + 12) { echo 'the integer 42' }
632	(else) { echo 'neither' }
633	}
634	# => Markdown
635
636
637	<!--
638	(Shell style like `if foo; then ... fi` and `case $x in ... esac` is also
639	legal, but discouraged in YSH code.)
640	-->
641
642	### Error Handling
643
644	If statements are also used for error handling. Builtins and external
645	commands use this style:
646
647	if ! test -d /bin {
648	echo 'not a directory'
649	}
650
651	if ! cp foo /tmp {
652	echo 'error copying' # any non-zero status
653	}
654
655	Procs use this style (because of shell's disabled `errexit` quirk):
656
657	try {
658	myproc
659	}
660	if failed {
661	echo 'failed'
662	}
663
664	For a complete list of examples, see [YSH Error
665	Handling](ysh-error.html). For design goals and a reference, see [YSH
666	Fixes Shell's Error Handling](error-handling.html).
667
668	#### exit, break, continue, return
669
670	The `exit` keyword exits a process. (It's not a shell builtin.)
671
672	The other 3 control flow keywords behave like they do in Python and JavaScript.
673
674	### Shell-like `proc`
675
676	You can define units of code with the `proc` keyword. A `proc` is like a
677	procedure or process.
678
679	proc my-ls {
680	ls -a -l @ARGV # pass args through
681	}
682
683	Simple procs like this are invoked like a shell command:
684
685	my-ls /dev/null /etc/passwd
686
687	You can name the parameters, and add a doc comment with `###`:
688
689	proc mycopy (src, dest) {
690	### Copy verbosely
691
692	mkdir -p $dest
693	cp --verbose $src $dest
694	}
695	touch log.txt
696	mycopy log.txt /tmp # first word 'mycopy' is a proc
697
698	Procs have many features, including four kinds of arguments:
699
700	1. Word args (which are always strings)
701	1. Typed, positional args
702	1. Typed, named args
703	1. A final block argument, which may be written with `{ }`.
704
705	At the call site, they can look like any of these forms:
706
707	ls /tmp # word arg
708
709	json write (d) # word arg, then positional arg
710
711	try {
712	error 'failed' (status=9) # word arg, then named arg
713	}
714
715	cd /tmp { echo $PWD } # word arg, then block arg
716
717	pp value ([1, 2]) # positional, typed arg
718
719	<!-- TODO: lazy arg list: ls8 | where [age > 10] -->
720
721	At the definition site, the kinds of parameters are separated with `;`, similar
722	to the Julia language:
723
724	proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
725	echo "$word1 $word2 $[pos1 + pos2]"
726	json write (rest_pos)
727	}
728
729	proc p3 (w ; ; named1, named2, ...rest_named; block) {
730	echo "$w $[named1 + named2]"
731	call io->eval(block)
732	json write (rest_named)
733	}
734
735	proc p4 (; ; ; block) {
736	call io->eval(block)
737	}
738
739	YSH also has Python-like functions defined with `func`. These are part of the
740	expression language, which we'll see later.
741
742	For more info, see the [Guide to Procs and Funcs](proc-func.html).
743
744	### Ruby-like Block Arguments
745
746	A block is a value of type `Command`. For example, `shopt` is a builtin
747	command that takes a block argument:
748
749	shopt --unset errexit { # ignore errors
750	cp ale /tmp
751	cp bean /bin
752	}
753
754	In this case, the block doesn't form a new scope.
755
756	#### Block Scope / Closures
757
758	However, by default, block arguments capture the frame they're defined in.
759	This means they obey lexical scope.
760
761	Consider this proc, which accepts a block, and runs it:
762
763	proc do-it (; ; ; block) {
764	call io->eval(block)
765	}
766
767	When the block arg is passed, the enclosing stack frame is captured. This
768	means that code inside the block can use variables in the captured frame:
769
770	var x = 42
771	do-it {
772	echo "x = $x" # outer x is visible LATER, when the block is run
773	}
774
775	- [Feature Index: Closures](ref/feature-index.html#Closures)
776
777	### Builtin Commands
778
779	Shell builtins like `cd` and `read` are the "standard library" of the
780	command language. Each one takes various flags:
781
782	cd -L . # follow symlinks
783
784	echo foo | read --all # read all of stdin
785
786	Here are some categories of builtin:
787
788	- I/O: `echo write read`
789	- File system: `cd test`
790	- Processes: `fork wait forkwait exec`
791	- Interpreter settings: `shopt shvar`
792	- Meta: `command builtin runproc type eval`
793
794	<!-- TODO: Link to a comprehensive list of builtins -->
795
796	## Expression Language: Python-like Types
797
798	YSH expressions look and behave more like Python or JavaScript than shell. For
799	example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
800	usually surrounded by `( )`.
801
802	At runtime, variables like `x` and `y` are bound to typed data, like
803	integers, floats, strings, lists, and dicts.
804
805	<!--
806	[Command vs. Expression Mode](command-vs-expression-mode.html) may help you
807	understand how YSH is parsed.
808	-->
809
810	### Python-like `func`
811
812	At the end of the Command Language, we saw that procs are shell-like units of
813	code. YSH also has Python-like functions, which are different than
814	`procs`:
815
816	- They're defined with the `func` keyword.
817	- They're called in expressions, not in commands.
818	- They're pure, and live in the interior of a process.
819	- In contrast, procs usually perform I/O, and have exterior boundaries.
820
821	The simplest function is:
822
823	func identity(x) {
824	return (x) # parens required for typed return
825	}
826
827	A more complex pure function:
828
829	func myRepeat(s, n; special=false) { # positional; named params
830	var parts = []
831	for i in (0 ..< n) {
832	append $s (parts)
833	}
834	var result = join(parts)
835
836	if (special) {
837	return ("$result !!")
838	} else {
839	return (result)
840	}
841	}
842
843	echo $[myRepeat('z', 3)] # => zzz
844
845	echo $[myRepeat('z', 3, special=true)] # => zzz !!
846
847	A function that mutates its argument:
848
849	func popTwice(mylist) {
850	call mylist->pop()
851	call mylist->pop()
852	}
853
854	var mylist = [3, 4]
855
856	# The call keyword is an "adapter" between commands and expressions,
857	# like the = keyword.
858	call popTwice(mylist)
859
860
861	Funcs are named using `camelCase`, while procs use `kebab-case`. See the
862	[Style Guide](style-guide.html) for more conventions.
863
864	#### Builtin Functions
865
866	In addition to builtin commands, YSH has Python-like builtin functions.
867	These are like the "standard library" for the expression language. Examples:
868
869	- Functions that take multiple types: `len() type()`
870	- Conversions: `bool() int() float() str() list() ...`
871	- Explicit word evaluation: `split() join() glob() maybe()`
872
873	<!-- TODO: Make a comprehensive list of func builtins. -->
874
875
876	### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
877
878	YSH has data types, each with an expression syntax and associated methods.
879
880	### Methods
881
882	Non-mutating methods are looked up with the `.` operator:
883
884	var line = ' ale bean '
885	var caps = line.trim().upper() # 'ALE BEAN'
886
887	Mutating methods are looked up with a thin arrow `->`:
888
889	var foods = ['ale', 'bean']
890	var last = foods->pop() # bean
891	write @foods # => ale
892
893	You can ignore the return value with the `call` keyword:
894
895	call foods->pop()
896
897	That is, YSH adds mutable data structures to shell, so we have a special syntax
898	for mutation.
899
900	---
901
902	You can also chain functions with a fat arrow `=>`:
903
904	var trimmed = line.trim() => upper() # 'ALE BEAN'
905
906	The `=>` operator allows functions to appear in a natural left-to-right order,
907	like methods.
908
909	# list() is a free function taking one arg
910	# join() is a free function taking two args
911	var x = {k1: 42, k2: 43} => list() => join('/') # 'k1/k2'
912
913	---
914
915	Now let's go through the data types in YSH. We'll show the syntax for
916	literals, and what methods they have.
917
918	#### Null and Bool
919
920	YSH uses JavaScript-like spellings for these three "atoms":
921
922	var x = null
923
924	var b1, b2 = true, false
925
926	if (b1) {
927	echo 'yes'
928	} # => yes
929
930
931	#### Int
932
933	There are many ways to write integers:
934
935	var small, big = 42, 65_536
936	echo "$small $big" # => 42 65536
937
938	var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
939	echo "$hex $octal $binary" # => 65536 493 21
940
941	<!--
942	"Runes" are integers that represent Unicode code points. They're not common in
943	YSH code, but can make certain string algorithms more readable.
944
945	# Pound rune literals are similar to ord('A')
946	const a = #'A'
947
948	# Backslash rune literals can appear outside of quotes
949	const newline = \n # Remember this is an integer
950	const backslash = \\ # ditto
951
952	# Unicode rune literal is syntactic sugar for 0x3bc
953	const mu = \u{3bc}
954
955	echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
956	-->
957
958	#### Float
959
960	Floats are written with a decimal point:
961
962	var big = 3.14
963
964	You can use scientific notation, as in Python:
965
966	var small = 1.5e-10
967
968	#### Str
969
970	See the section above on Three Kinds of String Literals. It described
971	`'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
972	as their multiline variants.
973
974	Strings are UTF-8 encoded in memory, like strings in the [Go
975	language](https://golang.org). There isn't a separate string and unicode type,
976	as in Python.
977
978	Strings are immutable, as in Python and JavaScript. This means they only
979	have transforming methods:
980
981	var x = s.trim()
982
983	Other methods:
984
985	- `trimLeft() trimRight()`
986	- `trimPrefix() trimSuffix()`
987	- `upper() lower()`
988	- `search() leftMatch()` - pattern matching
989	- `replace() split()`
990
991	#### List (and Arrays)
992
993	All lists can be expressed with Python-like literals:
994
995	var foods = ['ale', 'bean', 'corn']
996	var recursive = [1, [2, 3]]
997
998	As a special case, lists of strings are called arrays. It's often more
999	convenient to write them with shell-like literals:
1000
1001	# No quotes or commas
1002	var foods = :| ale bean corn |
1003
1004	# You can use the word language here
1005	var other = :| foo $s *.py {alice,bob}@example.com |
1006
1007	Lists are mutable, as in Python and JavaScript. So they mainly have
1008	mutating methods:
1009
1010	call foods->reverse()
1011	write -- @foods
1012	# =>
1013	# corn
1014	# bean
1015	# ale
1016
1017	#### Dict
1018
1019	Dicts use syntax that's like JavaScript. Here's a dict literal:
1020
1021	var d = {
1022	name: 'bob', # unquoted keys are allowed
1023	age: 42,
1024	'key with spaces': 'val'
1025	}
1026
1027	You can use either `[]` or `.` to retrieve a value, given a key:
1028
1029	var v1 = d['name']
1030	var v2 = d.name # shorthand for the above
1031	var v3 = d['key with spaces'] # no shorthand for this
1032
1033	(If the key doesn't exist, an error is raised.)
1034
1035	You can change Dict values with the same 2 syntaxes:
1036
1037	setvar d['name'] = 'other'
1038	setvar d.name = 'fun'
1039
1040	---
1041
1042	If you want to compute a key name, use an expression inside `[]`:
1043
1044	var key = 'alice'
1045	var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
1046	echo $[d2.alice_z] # => ZZZ
1047
1048	If you omit the value, it's taken from a variable of the same name:
1049
1050	var d3 = {key} # like {key: key}
1051	echo "name is $[d3.key]" # => name is alice
1052
1053	More examples:
1054
1055	var empty = {}
1056	echo $[len(empty)] # => 0
1057
1058	The `keys()` and `values()` functions return new `List` objects:
1059
1060	var keys = keys(d2) # => alice_z
1061	var vals = values(d3) # => alice
1062
1063	#### Obj
1064
1065	YSH has an `Obj` type that bundles code and data. (In contrast, JSON
1066	messages are pure data, not objects.)
1067
1068	The main purpose of objects is polymorphism:
1069
1070	var obj = makeMyObject(42) # I don't know what it looks like inside
1071
1072	echo $[obj.myMethod()] # But I can perform abstract operations
1073
1074	call obj->mutatingMethod() # Mutation is considered special, with ->
1075
1076	YSH objects are similar to Lua and JavaScript objects. They can be thought of
1077	as a linked list of `Dict` instances.
1078
1079	Or you can say they have a `Dict` of properties, and a recursive "prototype
1080	chain" that is also an `Obj`.
1081
1082	- [Feature Index: Objects](ref/feature-index.html#Objects)
1083
1084	### `Place` type / "out params"
1085
1086	The `read` builtin can set an implicit variable `_reply`:
1087
1088	whoami | read --all # sets _reply
1089
1090	Or you can pass a `value.Place`, created with `&`:
1091
1092	var x # implicitly initialized to null
1093	whoami | read --all (&x) # mutate this "place"
1094	echo who=$x # => who=andy
1095
1096	<!--
1097	#### Quotation Types: value.Command (Block) and value.Expr
1098
1099	These types are for reflection on YSH code. Most YSH programs won't use them
1100	directly.
1101
1102	- `Command`: an unevaluated code block.
1103	- rarely-used literal: `^(ls | wc -l)`
1104	- `Expr`: an unevaluated expression.
1105	- rarely-used literal: `^[42 + a[i]]`
1106	-->
1107
1108	### Operators
1109
1110	YSH operators are generally the same as in Python:
1111
1112	if (10 <= num_beans and num_beans < 20) {
1113	echo 'enough'
1114	} # => enough
1115
1116	YSH has a few operators that aren't in Python. Equality can be approximate or
1117	exact:
1118
1119	var n = ' 42 '
1120	if (n ~== 42) {
1121	echo 'equal after stripping whitespace and type conversion'
1122	} # => equal after stripping whitespace and type conversion
1123
1124	if (n === 42) {
1125	echo "not reached because strings and ints aren't equal"
1126	}
1127
1128	<!-- TODO: is n === 42 a type error? -->
1129
1130	Pattern matching can be done with globs (`~~` and `!~~`)
1131
1132	const filename = 'foo.py'
1133	if (filename ~~ '*.py') {
1134	echo 'Python'
1135	} # => Python
1136
1137	if (filename !~~ '*.sh') {
1138	echo 'not shell'
1139	} # => not shell
1140
1141	or regular expressions (`~` and `!~`). See the Eggex section below for an
1142	example of the latter.
1143
1144	Concatenation is `++` rather than `+` because it avoids confusion in the
1145	presence of type conversion:
1146
1147	var n = '42' + 1 # string plus int does implicit conversion
1148	echo $n # => 43
1149
1150	var y = 'ale ' ++ "bean $n" # concatenation
1151	echo $y # => ale bean 43
1152
1153	<!--
1154	TODO: change example above
1155	var n = '42' + 1 # string plus int does implicit conversion
1156	-->
1157
1158	<!--
1159
1160	#### Summary of Operators
1161
1162	- Arithmetic: `+ - * / // %` and `**` for exponentiation
1163	- `/` always yields a float, and `//` is integer division
1164	- Bitwise: `& | ^ ~`
1165	- Logical: `and or not`
1166	- Comparison: `== < > <= >= in 'not in'`
1167	- Approximate equality: `~==`
1168	- Eggex and glob match: `~ !~ ~~ !~~`
1169	- Ternary: `1 if x else 0`
1170	- Index and slice: `mylist[3]` and `mylist[1:3]`
1171	- `mydict->key` is a shortcut for `mydict['key']`
1172	- Function calls
1173	- free: `f(x, y)`
1174	- transformations and chaining: `s => startWith('prefix')`
1175	- mutating methods: `mylist->pop()`
1176	- String and List: `++` for concatenation
1177	- This is a separate operator because the addition operator `+` does
1178	string-to-int conversion
1179
1180	TODO: What about list comprehensions?
1181	-->
1182
1183	### Egg Expressions (YSH Regexes)
1184
1185	An Eggex is a YSH expression that denotes a regular expression. Eggexes
1186	translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
1187	--regexp-extended` (GNU only).
1188
1189	They're designed to be readable and composable. Example:
1190
1191	var D = / digit{1,3} /
1192	var ip_pattern = / D '.' D '.' D '.' D /
1193
1194	var z = '192.168.0.1'
1195	if (z ~ ip_pattern) { # Use the ~ operator to match
1196	echo "$z looks like an IP address"
1197	} # => 192.168.0.1 looks like an IP address
1198
1199	if (z !~ / '.255' %end /) {
1200	echo "doesn't end with .255"
1201	} # => doesn't end with .255
1202
1203	See the [Egg Expressions doc](eggex.html) for details.
1204
1205	## Interlude
1206
1207	Before moving on to other YSH features, let's review what we've seen.
1208
1209	### Three Interleaved Languages
1210
1211	Here are the languages we saw in the last 3 sections:
1212
1213	1. Words evaluate to a string, or list of strings. This includes:
1214	- literals like `'mystr'`
1215	- substitutions like `${x}` and `$(hostname)`
1216	- globs like `*.sh`
1217	2. Commands are used for
1218	- I/O: pipelines, builtins like `read`
1219	- control flow: `if`, `for`
1220	- abstraction: `proc`
1221	3. Expressions on typed data are borrowed from Python, with influence from
1222	JavaScript:
1223	- Lists: `['ale', 'bean']` or `:| ale bean |`
1224	- Dicts: `{name: 'bob', age: 42}`
1225	- Functions: `split('ale bean')` and `join(['pea', 'nut'])`
1226
1227	### How Do They Work Together?
1228
1229	Here are two examples:
1230
1231	(1) In this command, there are four words. The fourth word is an
1232	expression sub `$[]`.
1233
1234	write hello $name $[d['age'] + 1]
1235	# =>
1236	# hello
1237	# world
1238	# 43
1239
1240	(2) In this assignment, the expression on the right hand side of `=`
1241	concatenates two strings. The first string is a literal, and the second is a
1242	command sub.
1243
1244	var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
1245	write $food # => ale BEAN
1246
1247	So words, commands, and expressions are mutually recursive. If you're a
1248	conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
1249	help you understand this on a deeper level.
1250
1251	<!--
1252	One way to think about these sublanguages is to note that the `|` character
1253	means something different in each context:
1254
1255	- In the command language, it's the pipeline operator, as in `ls | wc -l`
1256	- In the word language, it's only valid in a literal string like `'|'`, `"|"`,
1257	or `\|`. (It's also used in `${x|html}`, which formats a string.)
1258	- In the expression language, it's the bitwise OR operator, as in Python and
1259	JavaScript.
1260	-->
1261
1262	---
1263
1264	Let's move on from talking about code, and talk about data.
1265
1266	## Data Notation / Interchange Formats
1267
1268	In YSH, you can read and write data languages based on [JSON]($xref). This is
1269	a primary way to exchange messages between Unix processes.
1270
1271	Instead of being executed, like our command/word/expression languages,
1272	these languages are parsed as data structures.
1273
1274	<!-- TODO: Link to slogans, fallacies, and concepts -->
1275
1276	### UTF-8
1277
1278	UTF-8 is the foundation of our data notation. It's the most common Unicode
1279	encoding, and the most consistent:
1280
1281	var x = u'hello \u{1f642}' # store a UTF-8 string in memory
1282	echo $x # send UTF-8 to stdout
1283
1284	hello 🙂
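
At the byte level, the emoji U+1F642 occupies four bytes in UTF-8: `f0 9f 99 82`. The sketch below checks this with plain POSIX tools rather than YSH syntax. The function name `smiley_bytes` is hypothetical, and the bytes are spelled as octal escapes because that is what POSIX `printf` accepts.

```sh
# U+1F642 encoded as UTF-8 is f0 9f 99 82 (octal 360 237 231 202).
smiley_bytes() {
  printf '\360\237\231\202'
}

smiley_bytes | wc -c          # 4: a count of bytes, not code points
smiley_bytes | od -An -tx1    # f0 9f 99 82
```

A single code point can therefore span several bytes, which matters whenever a tool counts or slices by byte.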
1285
1286	<!-- TODO: there's a runes() iterator which gives integer offsets, usable for
1287	slicing -->
1288
1289	### JSON
1290
1291	JSON messages are UTF-8 text. You can encode and decode JSON with functions
1292	(`func` style):
1293
1294	var message = toJson({x: 42}) # => (Str) '{"x": 42}'
1295	var mydict = fromJson('{"x": 42}') # => (Dict) {x: 42}
1296
1297	Or with commands (`proc` style):
1298
1299	json write ({x: 42}) > foo.json # writes '{"x": 42}'
1300
1301	json read (&mydict) < foo.json # create var
1302	= mydict # => (Dict) {x: 42}
1303
1304	### J8 Notation
1305
1306	But JSON isn't quite enough for a principled shell.
1307
1308	- Traditional Unix tools like `grep` and `awk` operate on streams of lines.
1309	In YSH, to avoid data-dependent bugs, we want a reliable way of quoting
1310	lines.
1311	- In YSH, we also want to represent binary data, not just text. When you
1312	read a Unix file, it may or may not be text.
1313
1314	So we borrow JSON-style strings, and create [J8 Notation][]. Slogans:
1315
1316	- Deconstructing and Augmenting JSON
1317	- Fixing the JSON-Unix Mismatch
1318
1319	[J8 Notation]: $xref:j8-notation
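
The data-dependent bugs mentioned above are easy to reproduce. The sketch below, in plain POSIX shell rather than YSH, creates two files, one of which has a newline in its name, and shows that a line-oriented consumer sees three lines. It assumes `mktemp -d` is available (common, but not part of POSIX), and the function name `count_find_lines` is hypothetical.

```sh
count_find_lines() {
  demo_dir=$(mktemp -d)
  touch "$demo_dir/plain.txt"
  # Command substitution strips only trailing newlines, so this name keeps
  # its embedded newline.
  touch "$demo_dir/$(printf 'two\nlines.txt')"

  # find prints one name per line, without any quoting
  find "$demo_dir" -type f | wc -l

  rm -r "$demo_dir"
}

count_find_lines   # prints 3, although only 2 files exist
```

A JSON-style quoted string keeps such a name on a single line by escaping the newline as `\n`.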
1320
1321	#### J8 Lines
1322
1323	J8 Lines are a building block of J8 Notation. If you have a file
1324	`lines.txt`:
1325
1326	<pre>
1327	doc/hello.md
1328	"doc/with spaces.md"
1329	b'doc/with byte \yff.md'
1330	</pre>
1331
1332	Then you can decode it with a split command sub (mentioned above):
1333
1334	var decoded = @(cat lines.txt)
1335
1336	This file has:
1337
1338	1. An unquoted string
1339	1. A JSON string with `"double quotes"`
1340	1. A J8-style string: `u'unicode'` or `b'bytes'`
1341
1342	<!--
1343	TODO: fromJ8Line() toJ8Line()
1344	-->
1345
1346	#### JSON8 is Tree-Shaped
1347
1348	JSON8 is just like JSON, but it allows J8-style strings:
1349
1350	<pre>
1351	{ "foo": "hi \uD83D\uDE42"} # valid JSON, and valid JSON8
1352	{u'foo': u'hi \u{1F642}' } # valid JSON8, with J8-style strings
1353	</pre>
1354
1355	<!--
1356	In addition to strings and lines, you can write and read tree-shaped data
1357	as [JSON][]:
1358
1359	var d = {key: 'value'}
1360	json write (d) # dump variable d as JSON
1361	# =>
1362	# {
1363	# "key": "value"
1364	# }
1365
1366	echo '["ale", 42]' > example.json
1367
1368	json read (&d2) < example.json # parse JSON into var d2
1369	pp (d2) # pretty print it
1370	# => (List) ['ale', 42]
1371
1372	[JSON][] will lose information when strings have binary data, but the slight
1373	[JSON8]($xref) upgrade won't:
1374
1375	var b = {binary: $'\xff'}
1376	json8 write (b)
1377	# =>
1378	# {
1379	# "binary": b'\yff'
1380	# }
1381	-->
1382
1383	[JSON]: $xref
1384
1385	#### TSV8 is Table-Shaped
1386
1387	(TODO: not yet implemented.)
1388
1389	YSH supports data notation for tables:
1390
1391	1. Plain TSV files, which are untyped. Every column has string data.
1392	- Cells with tabs, newlines, and binary data are a problem.
1393	2. Our extension [TSV8]($xref), which supports typed data.
1394	- It uses JSON notation for booleans, integers, and floats.
1395	- It uses J8 strings, which can represent any string.
1396
1397	<!-- Figure out the API. Does it work like JSON?
1398
1399	Or I think we just implement
1400	- rows: 'where' or 'filter' (dplyr)
1401	- cols: 'select' conflicts with shell builtin; call it 'cols'?
1402	- sort: 'sort-by' or 'arrange' (dplyr)
1403	- TSV8 <=> sqlite conversion. Are these drivers or what?
1404	- and then let you pipe output?
1405
1406	Do we also need TSV8 space2tab or something? For writing TSV8 inline.
1407
1408	More later:
1409	- MessagePack (e.g. for shared library extension modules)
1410	- msgpack read, write? I think user-defined function could be like this?
1411	- SASH: Simple and Strict HTML? For easy processing
1412	-->
1413
1414	## YSH Modules are Files
1415
1416	A module is a file of source code, like `lib/myargs.ysh`. The `use`
1417	builtin turns it into an `Obj` that can be invoked and inspected:
1418
1419	use myargs.ysh
1420
1421	myargs proc1 --flag val # module name becomes a prefix, via __invoke__
1422	var alias = myargs.proc1 # module has attributes
1423
1424	You can import specific names with the `--pick` flag:
1425
1426	use myargs.ysh --pick p2 p3
1427
1428	p2
1429	p3
1430
1431	- [Feature Index: Modules](ref/feature-index.html#Modules)
1432
1433	## The Runtime Shared by OSH and YSH
1434
1435	Although we describe OSH and YSH as different languages, they use the same
1436	interpreter under the hood.
1437
1438	This interpreter has many `shopt` booleans to control behavior, like `shopt
1439	--set parse_paren`. The group `shopt --set ysh:all` flips all booleans to make
1440	`bin/osh` behave like `bin/ysh`.
1441
1442	Understanding this common runtime, and its interface to the Unix kernel, will
1443	help you understand both languages!
1444
1445	### Interpreter Data Model
1446
1447	The [Interpreter State](interpreter-state.html) doc is under construction. It
1448	will cover:
1449
1450	- The call stack for OSH and YSH
1451	- Each stack frame is a `{name -> cell}` mapping.
1452	- Each cell has a value, with boolean flags
1453	- OSH has types `Str BashArray BashAssoc`, and flags `readonly export
1454	nameref`.
1455	- YSH has types `Bool Int Float Str List Dict Obj ...`, and the `readonly`
1456	flag.
1457	- YSH namespaces
1458	- Modules with `use`
1459	- Builtin functions and commands
1460	- ENV
1461	- Shell options
1462	- Boolean options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
1463	- String options with `shvar`: `IFS`, `PATH`
1464	- Registers that store interpreter state
1465	- `$?` and `_error`
1466	- `$!` for the last PID
1467	- `_this_dir`
1468	- `_reply`
1469
1470	### Process Model (the kernel)
1471
1472	The [Process Model](process-model.html) doc is under construction. It will cover:
1473
1474	- Simple Commands, `exec`
1475	- Pipelines. #[shell-the-good-parts](#blog-tag)
1476	- `fork`, `forkwait`
1477	- Command and process substitution
1478	- Related:
1479	- [Tracing execution in Oils](xtrace.html) (xtrace), which divides
1480	process-based concurrency into synchronous and async constructs.
1481	- [Three Comics For Understanding Unix
1482	Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)
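
One consequence of this process model is worth seeing early: a command substitution runs in a child process, so assignments made inside it do not reach the parent. Here is a minimal sketch in POSIX shell syntax (YSH spells variable assignment differently), with the hypothetical function name `show_isolation`:

```sh
show_isolation() {
  count=0
  # The command substitution runs in a forked child, which works on its own
  # copy of the parent's variables.
  result=$(count=5; echo "child saw $count")
  echo "$result"
  echo "parent still has $count"
}

show_isolation
# child saw 5
# parent still has 0
```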
1483
1484	<!--
1485	Process model additions: Capers, Headless shell
1486
1487	some optimizations: See YSH starts fewer processes than other shells.
1488	-->
1489
1490	### Advanced: Reflecting on the Interpreter
1491
1492	You can reflect on the interpreter with APIs like `io->eval()` and
1493	`vm.getFrame()`.
1494
1495	- [Feature Index: Reflection](ref/feature-index.html#Reflection)
1496
1497	This allows YSH to be a language for creating other languages. (Ruby, Tcl, and
1498	Racket also have this flavor.)
1499
1500	<!--
1501
1502	TODO: Hay and Awk examples
1503	-->
1504
1505	## Summary
1506
1507	What have we described in this tour?
1508
1509	YSH is a programming language that evolved from Unix shell. But you can
1510	"forget" the bad parts of shell like `[ $x -lt $y ]`.
1511
1512	<!--
1513	Instead, we've shown you shell-like commands, Python-like expressions on typed
1514	data, and Ruby-like command blocks.
1515	-->
1516
1517	Instead, focus on these central concepts:
1518
1519	1. Interleaved word, command, and expression languages.
1520	2. A standard library of builtin commands, as well as builtin functions
1521	3. Languages for data: J8 Notation, including JSON8 and TSV8
1522	4. A runtime shared by OSH and YSH
1523
1524	## Appendix
1525
1526	### Related Docs
1527
1528	- [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
1529	- [YSH Language Influences](language-influences.html) - In addition to shell,
1530	Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
1531	- [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
1532	you remember the syntax.
1533	- [YSH Language Warts](warts.html) documents syntax that may be surprising.
1534
1535
1536	### YSH Script Template
1537
1538	YSH can be used to write simple "shell scripts" or longer programs. It has
1539	procs and modules to help with the latter.
1540
1541	A module is just a file, like this:
1542
1543	```
1544	#!/usr/bin/env ysh
1545	### Deploy script
1546
1547	use $_this_dir/lib/util.ysh --pick log
1548
1549	const DEST = '/tmp/ysh-tour'
1550
1551	proc my-sync(...files) {
1552	### Sync files and show which ones
1553
1554	cp --verbose @files $DEST
1555	}
1556
1557	proc main {
1558	mkdir -p $DEST
1559
1560	touch {foo,bar}.py {build,test}.sh
1561
1562	log "Copying source files"
1563	my-sync *.py *.sh
1564
1565	if test --dir /tmp/logs {
1566	cd /tmp/logs
1567
1568	log "Copying logs"
1569	my-sync *.log
1570	}
1571	}
1572
1573	if is-main { # The only top-level statement
1574	main @ARGV
1575	}
1576	```
1577
1578	<!--
1579	TODO:
1580	- Also show flags parsing?
1581	- Show longer examples where it isn't boilerplate
1582	-->
1583
1584	You wouldn't bother with the boilerplate for something this small. But this
1585	example illustrates the basic idea: the top level often contains these words:
1586	`use`, `const`, `proc`, and `func`.
1587
1588
1589	<!--
1590	TODO: not mentioning __provide__, since it should be optional in the most basic usage?
1591	-->
1592
1593	### YSH Features Not Shown
1594
1595	#### Advanced
1596
1597	These shell features are part of YSH, but aren't shown above:
1598
1599	- The `fork` and `forkwait` builtins, for concurrent execution and subshells.
1600	- Process Substitution: `diff <(sort left.txt) <(sort right.txt)`
1601
1602	#### Deprecated Shell Constructs
1603
1604	The shared interpreter supports many shell constructs that are deprecated:
1605
1606	- YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
1607	is on by default.
1608	- Assignment builtins like `local` and `declare`. Use YSH keywords.
1609	- Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
1610	- Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
1611	- The `until` loop can always be replaced with a `while` loop.
1612	- Most of what's in `${}` can be written in other ways. For example
1613	`${s#/tmp}` could be `s => removePrefix('/tmp')` (TODO).
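
For readers who don't know the shell notation, `${s#/tmp}` removes a leading `/tmp` from the value of `s` if it is present. Here is a small sketch in POSIX shell syntax, with the hypothetical function name `strip_tmp_prefix`:

```sh
strip_tmp_prefix() {
  s=$1
  # ${s#pattern} removes the shortest leading match of pattern, if any
  echo "${s#/tmp}"
}

strip_tmp_prefix /tmp/ysh-tour   # /ysh-tour
strip_tmp_prefix /home/alice     # /home/alice (nothing to remove)
```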
1614
1615	#### Not Yet Implemented
1616
1617	This document mentions a few constructs that aren't yet implemented. Here's a
1618	summary:
1619
1620	```none
1621	# Unimplemented syntax:
1622
1623	echo ${x|html} # formatters
1624
1625	echo ${x %.2f} # statically-parsed printf
1626
1627	var x = "<p>$x</p>"html
1628	echo "<p>$x</p>"html # tagged string
1629
1630	var x = 15 Mi # units suffix
1631	```
1632
1633	<!--
1634	- To implement: Capers: stateless coprocesses
1635	-->
1636