| 1 | ---
|
| 2 | default_highlighter: oils-sh
|
| 3 | ---
|
| 4 |
|
| 5 | A Tour of YSH
|
| 6 | =============
|
| 7 |
|
| 8 | <!-- author's note about example names
|
| 9 |
|
| 10 | - people: alice, bob
|
| 11 | - nouns: ale, bean
|
| 12 | - peanut, coconut
|
| 13 | - 42 for integers
|
| 14 | -->
|
| 15 |
|
| 16 | This doc describes the [YSH]($xref) language from a **clean slate**
|
| 17 | perspective. We don't assume you know Unix shell, or the compatible
|
| 18 | [OSH]($xref). But shell users will see the similarity, with simplifications
|
| 19 | and upgrades.
|
| 20 |
|
| 21 | Remember, YSH is for Python and JavaScript users who avoid shell! See the
|
| 22 | [project FAQ][FAQ] for more color on that.
|
| 23 |
|
| 24 | [FAQ]: https://www.oilshell.org/blog/2021/01/why-a-new-shell.html
|
| 25 |
|
| 26 | This document is **long** because it demonstrates nearly every feature of the
|
| 27 | language. You may want to read it in multiple sittings, or read [The Simplest
|
| 28 | Explanation of
|
| 29 | Oil](https://www.oilshell.org/blog/2020/01/simplest-explanation.html) first.
|
| 30 | (Until 2023, YSH was called the "Oil language".)
|
| 31 |
|
| 32 |
|
| 33 | Here's a summary of what follows:
|
| 34 |
|
| 35 | 1. YSH has interleaved *word*, *command*, and *expression* languages.
|
| 36 | - The command language has Ruby-like *blocks*, and the expression language
|
| 37 | has Python-like *data types*.
|
| 38 | 2. YSH has both builtin *commands* like `cd /tmp`, and builtin *functions* like
|
| 39 | `join()`.
|
| 40 | 3. Languages for *data*, like [JSON][], are complementary to YSH code.
|
| 41 | 4. OSH and YSH share both an *interpreter data model* and a *process model*
|
| 42 | (provided by the Unix kernel). Understanding these common models will make
|
| 43 | you both a better shell user and YSH user.
|
| 44 |
|
| 45 | Keep these points in mind as you read the details below.
|
| 46 |
|
| 47 | [JSON]: https://json.org
|
| 48 |
|
| 49 | <div id="toc">
|
| 50 | </div>
|
| 51 |
|
| 52 | ## Preliminaries
|
| 53 |
|
| 54 | Start YSH just like you start bash or Python:
|
| 55 |
|
| 56 | <!-- oils-sh below skips code block extraction, since it doesn't run -->
|
| 57 |
|
| 58 | ```sh-prompt
|
| 59 | bash$ ysh # assuming it's installed
|
| 60 |
|
| 61 | ysh$ echo 'hello world' # command typed into YSH
|
| 62 | hello world
|
| 63 | ```
|
| 64 |
|
| 65 | In the sections below, we'll save space by showing output **in comments**, with
|
| 66 | `=>`:
|
| 67 |
|
| 68 | echo 'hello world' # => hello world
|
| 69 |
|
| 70 | Multi-line output is shown like this:
|
| 71 |
|
| 72 | echo one
|
| 73 | echo two
|
| 74 | # =>
|
| 75 | # one
|
| 76 | # two
|
| 77 |
|
| 78 | ## Examples
|
| 79 |
|
| 80 | ### Hello World Script
|
| 81 |
|
| 82 | You can also type commands into a file like `hello.ysh`. This is a complete
|
| 83 | YSH program, which is identical to a shell program:
|
| 84 |
|
| 85 | echo 'hello world' # => hello world
|
| 86 |
|
| 87 | ### A Taste of YSH
|
| 88 |
|
| 89 | Unlike shell, YSH has `var` and `const` keywords:
|
| 90 |
|
| 91 | const name = 'world' # const is rarer, used at the top level
|
| 92 | echo "hello $name" # => hello world
|
| 93 |
|
| 94 | They take rich Python-like expressions on the right:
|
| 95 |
|
| 96 | var x = 0 # an integer, not a string
|
| 97 | setvar x = x * 2 + 1 # mutate with the 'setvar' keyword
|
| 98 |
|
| 99 | setvar x += 5 # Increment by 5
|
| 100 | echo $x # => 6
|
| 101 |
|
| 102 | var mylist = [x, 7] # two integers [6, 7]
|
| 103 |
|
| 104 | Expressions are often surrounded by `()`:
|
| 105 |
|
| 106 | if (x > 0) {
|
| 107 | echo 'positive'
|
| 108 | } # => positive
|
| 109 |
|
| 110 | for i, item in (mylist) { # 'mylist' is a variable, not a string
|
| 111 | echo "[$i] item $item"
|
| 112 | }
|
| 113 | # =>
|
| 114 | # [0] item 6
|
| 115 | # [1] item 7
|
| 116 |
|
| 117 | YSH has Ruby-like blocks:
|
| 118 |
|
| 119 | cd /tmp {
|
| 120 | echo hi > greeting.txt # file created inside /tmp
|
| 121 | echo $PWD # => /tmp
|
| 122 | }
|
| 123 | echo $PWD # prints the original directory
|
| 124 |
|
| 125 | And utilities to read and write JSON:
|
| 126 |
|
| 127 | var person = {name: 'bob', age: 42}
|
| 128 | json write (person)
|
| 129 | # =>
|
| 130 | # {
|
| 131 | # "name": "bob",
|
| 132 | # "age": 42
|
| 133 | # }
|
| 134 |
|
| 135 | echo '["str", 42]' | json read # sets '_reply' variable by default
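For readers coming from Python, `json write` and `json read` behave roughly like `json.dumps()` and `json.loads()`. The sketch below is plain Python, not YSH, and the variable names `text` and `reply` are illustrative assumptions:

```python
import json

person = {"name": "bob", "age": 42}

# Roughly what `json write (person)` prints: indented, valid JSON
text = json.dumps(person, indent=2)
print(text)

# Roughly what `json read` does with its stdin: decode JSON into typed data
reply = json.loads('["str", 42]')
print(reply)  # a list holding a string and an integer
```

Note that valid JSON has no trailing comma after the last member of an object.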
|
| 136 |
|
| 137 | ### Tip: Use the `=` operator interactively
|
| 138 |
|
| 139 | The `=` keyword evaluates and prints an expression:
|
| 140 |
|
| 141 | = _reply
|
| 142 | # => (List) ["str", 42]
|
| 143 |
|
| 144 | (Think of it like `var x = _reply`, without the `var`.)
|
| 145 |
|
| 146 | The **best way** to learn YSH is to type these examples and see what happens!
|
| 147 |
|
| 148 | ## Word Language: Expressions for Strings (and Arrays)
|
| 149 |
|
| 150 | Let's describe the word language first, and then talk about commands and
|
| 151 | expressions. Words are a rich language because **strings** are a central
|
| 152 | concept in shell.
|
| 153 |
|
| 154 | ### Unquoted Words
|
| 155 |
|
| 156 | Words denote strings, but you often don't need to quote them:
|
| 157 |
|
| 158 | echo hi # => hi
|
| 159 |
|
| 160 | Quotes are useful when a string has spaces, or punctuation characters like `( )
|
| 161 | ;`.
|
| 162 |
|
| 163 | ### Three Kinds of String Literals
|
| 164 |
|
| 165 | You can choose the style that's most convenient to write a given string.
|
| 166 |
|
| 167 | #### Double-Quoted, Single-Quoted, and J8 strings (like JSON)
|
| 168 |
|
| 169 | Double-quoted strings allow **interpolation**, with `$`:
|
| 170 |
|
| 171 | var person = 'alice'
|
| 172 | echo "hi $person, $(echo bye)" # => hi alice, bye
|
| 173 |
|
| 174 | To write the special characters `$`, `"`, and `\` literally, escape them with `\`:
|
| 175 |
|
| 176 | echo "\$ \" \\ " # => $ " \
|
| 177 |
|
| 178 | In single-quoted strings, all characters are **literal** (except `'`, which
|
| 179 | can't be expressed):
|
| 180 |
|
| 181 | echo 'c:\Program Files\' # => c:\Program Files\
|
| 182 |
|
| 183 | If you want C-style backslash **character escapes**, use a J8 string, which is
|
| 184 | like JSON, but with single quotes:
|
| 185 |
|
| 186 | echo u' A is \u{41} \n line two, with backslash \\'
|
| 187 | # =>
|
| 188 | # A is A
|
| 189 | # line two, with backslash \
|
| 190 |
|
| 191 | The `u''` strings are guaranteed to be valid Unicode (unlike JSON). You can
|
| 192 | also use `b''` strings:
|
| 193 |
|
| 194 | echo b'byte \yff' # Byte that's not valid unicode, like \xff in C.
|
| 195 | # Don't confuse it with \u{ff}.
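The difference between a raw byte and a code point may be easier to see in Python, which spells `\u{41}` as `\u0041` and `\yff` as `\xff`. This is an analogy in plain Python, not YSH, and the variable names are illustrative:

```python
# A code point escape inside a text string, like u'\u{41}'
text = "A is \u0041"

# A single raw byte 0xff inside a byte string, like b'\yff'
raw = b"byte \xff"

# The code point U+00FF is a different thing: UTF-8 encodes it as two bytes
encoded = "\u00ff".encode("utf-8")

# The lone byte 0xff is not valid UTF-8, so decoding it fails
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

print(text, encoded, is_valid_utf8)
```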
|
| 196 |
|
| 197 | #### Multi-line Strings
|
| 198 |
|
| 199 | Multi-line strings are surrounded with triple quotes. They come in the same
|
| 200 | three varieties, and leading whitespace is stripped in a convenient way.
|
| 201 |
|
| 202 | sort <<< """
|
| 203 | var sub: $x
|
| 204 | command sub: $(echo hi)
|
| 205 | expression sub: $[x + 3]
|
| 206 | """
|
| 207 | # =>
|
| 208 | # command sub: hi
|
| 209 | # expression sub: 9
|
| 210 | # var sub: 6
|
| 211 |
|
| 212 | sort <<< '''
|
| 213 | $2.00
|
| 214 | $1.99
|
| 215 | ''' # literal $, no interpolation
|
| 216 | # =>
|
| 217 | # $1.99
|
| 218 | # $2.00
|
| 219 |
|
| 220 | sort <<< u'''
|
| 221 | C\tD
|
| 222 | A\tB
|
| 223 | ''' # b''' strings also supported
|
| 224 | # =>
|
| 225 | # A B
|
| 226 | # C D
|
| 227 |
|
| 228 | (Use multiline strings instead of shell's [here docs]($xref:here-doc).)
|
| 229 |
|
| 230 | ### Three Kinds of Substitution
|
| 231 |
|
| 232 | YSH has syntax for 3 types of substitution, all of which start with `$`. That
|
| 233 | is, you can convert any of these things to a **string**:
|
| 234 |
|
| 235 | 1. Variables
|
| 236 | 2. The output of commands
|
| 237 | 3. The value of expressions
|
| 238 |
|
| 239 | #### Variable Sub
|
| 240 |
|
| 241 | The syntax `$a` or `${a}` converts a variable to a string:
|
| 242 |
|
| 243 | var a = 'ale'
|
| 244 | echo $a # => ale
|
| 245 | echo _${a}_ # => _ale_
|
| 246 | echo "_ $a _" # => _ ale _
|
| 247 |
|
| 248 | The shell operator `:-` is occasionally useful in YSH:
|
| 249 |
|
| 250 | echo ${not_defined:-'default'} # => default
|
| 251 |
|
| 252 | #### Command Sub
|
| 253 |
|
| 254 | The `$(echo hi)` syntax runs a command and captures its `stdout`:
|
| 255 |
|
| 256 | echo $(hostname) # => example.com
|
| 257 | echo "_ $(hostname) _" # => _ example.com _
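As a rough Python analogy (not YSH), command sub corresponds to running a child process, capturing its `stdout`, and dropping the trailing newline. The child command here is a stand-in chosen so the sketch runs anywhere; it is not part of the original example:

```python
import subprocess
import sys

# Run a child process and capture its stdout, like $(...)
proc = subprocess.run(
    [sys.executable, "-c", "print('hi')"],
    capture_output=True,
    text=True,
    check=True,
)

# Shell-style command sub drops trailing newlines; here that's done by hand
captured = proc.stdout.rstrip("\n")
print(f"_ {captured} _")
```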
|
| 258 |
|
| 259 | #### Expression Sub
|
| 260 |
|
| 261 | The `$[myexpr]` syntax evaluates an expression and converts it to a string:
|
| 262 |
|
| 263 | echo $[a] # => ale
|
| 264 | echo $[1 + 2 * 3] # => 7
|
| 265 | echo "_ $[1 + 2 * 3] _" # => _ 7 _
|
| 266 |
|
| 267 | <!-- TODO: safe substitution with "$[a]"html -->
|
| 268 |
|
| 269 | ### Arrays of Strings: Globs, Brace Expansion, Splicing, and Splitting
|
| 270 |
|
| 271 | There are four constructs that evaluate to a **list of strings**, rather than a
|
| 272 | single string.
|
| 273 |
|
| 274 | #### Globs
|
| 275 |
|
| 276 | Globs like `*.py` evaluate to a list of files.
|
| 277 |
|
| 278 | touch foo.py bar.py # create the files
|
| 279 | write *.py
|
| 280 | # =>
|
| 281 | # bar.py
|
| 282 | # foo.py
|
| 283 |
|
| 284 | If no files match, it evaluates to an empty list (`[]`).
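For comparison, here is a Python sketch of the same two behaviors: a pattern expands to the matching file names, and a pattern that matches nothing gives an empty list. It is an analogy, not YSH; the temporary directory and the `*.rb` pattern are assumptions made for the example. Python's `glob.glob()` makes no ordering promise, so the sketch sorts explicitly, whereas shell globs conventionally come back sorted.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("foo.py", "bar.py"):
        open(os.path.join(tmp, name), "w").close()  # like 'touch'

    # Expand the pattern, then keep only the file names, sorted
    paths = glob.glob(os.path.join(tmp, "*.py"))
    matches = sorted(os.path.basename(p) for p in paths)

    # A pattern with no matching files yields an empty list
    no_matches = glob.glob(os.path.join(tmp, "*.rb"))

print(matches)
print(no_matches)
```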
|
| 285 |
|
| 286 | #### Brace Expansion
|
| 287 |
|
| 288 | The brace expansion mini-language lets you write strings without duplication:
|
| 289 |
|
| 290 | write {alice,bob}@example.com
|
| 291 | # =>
|
| 292 | # alice@example.com
|
| 293 | # bob@example.com
|
| 294 |
|
| 295 | #### Splicing
|
| 296 |
|
| 297 | The `@` operator splices an array into a command:
|
| 298 |
|
| 299 | var myarray = :| ale bean |
|
| 300 | write S @myarray E
|
| 301 | # =>
|
| 302 | # S
|
| 303 | # ale
|
| 304 | # bean
|
| 305 | # E
|
| 306 |
|
| 307 | You also have `@[]` to splice an expression that evaluates to a list:
|
| 308 |
|
| 309 | write -- @[split('ale bean')]
|
| 310 | # =>
|
| 311 | # ale
|
| 312 | # bean
|
| 313 |
|
| 314 | Each item will be converted to a string.
|
| 315 |
|
| 316 | #### Split Command Sub / Split Builtin Sub
|
| 317 |
|
| 318 | There's also a variant of *command sub* that decodes J8 lines into a sequence
|
| 319 | of strings:
|
| 320 |
|
| 321 | write @(seq 3) # write is passed 3 args
|
| 322 | # =>
|
| 323 | # 1
|
| 324 | # 2
|
| 325 | # 3
|
| 326 |
|
| 327 | ## Command Language: I/O, Control Flow, Abstraction
|
| 328 |
|
| 329 | ### Simple Commands
|
| 330 |
|
| 331 | A simple command is a space-separated list of words. YSH looks up the first
|
| 332 | word to determine if it's a builtin command, or a user-defined `proc`.
|
| 333 |
|
| 334 | echo 'hello world' # The shell builtin 'echo'
|
| 335 |
|
| 336 | proc greet (name) { # Define a unit of code
|
| 337 | echo "hello $name"
|
| 338 | }
|
| 339 |
|
| 340 | # The first word now resolves to the proc you defined
|
| 341 | greet alice # => hello alice
|
| 342 |
|
| 343 | If it's neither, then it's assumed to be an external command:
|
| 344 |
|
| 345 | ls -l /tmp # The external 'ls' command
|
| 346 |
|
| 347 | Commands accept traditional string arguments, as well as typed arguments in
|
| 348 | parentheses:
|
| 349 |
|
| 350 | # 'write' is a string arg; 'x' is a typed expression arg
|
| 351 | json write (x)
|
| 352 |
|
| 353 | <!--
|
| 354 | Block args are a special kind of typed arg:
|
| 355 |
|
| 356 | cd /tmp {
|
| 357 | echo $PWD
|
| 358 | }
|
| 359 | -->
|
| 360 |
|
| 361 | ### Redirects
|
| 362 |
|
| 363 | You can **redirect** `stdin` and `stdout` of simple commands:
|
| 364 |
|
| 365 | echo hi > tmp.txt # write to a file
|
| 366 | sort < tmp.txt
|
| 367 |
|
| 368 | Here are the most common idioms for using `stderr` (identical to shell):
|
| 369 |
|
| 370 | ls /tmp 2>errors.txt
|
| 371 | echo 'fatal error' >&2
|
| 372 |
|
| 373 | ### ARGV and ENV
|
| 374 |
|
| 375 | At the top level, the `ARGV` list holds the arguments passed to the shell:
|
| 376 |
|
| 377 | var num_args = len(ARGV)
|
| 378 | ls /tmp @ARGV # pass shell's arguments through
|
| 379 |
|
| 380 | Inside a `proc` without declared parameters, `ARGV` holds the arguments passed
|
| 381 | to the `proc`. (Procs are explained below.)
|
| 382 |
|
| 383 | ---
|
| 384 |
|
| 385 | You can add to the environment of a new process with a *prefix binding*:
|
| 386 |
|
| 387 | PYTHONPATH=vendor ./demo.py # os.environ will have {'PYTHONPATH': 'vendor'}
|
| 388 |
|
| 389 | Under the hood, the prefix binding temporarily augments the `ENV` object, which
|
| 390 | is the current environment.
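In Python terms, a prefix binding is similar to passing an augmented copy of the environment to one child process while leaving the parent's environment alone. The following is a sketch of that idea in Python, not YSH; the inline child program stands in for `./demo.py`:

```python
import os
import subprocess
import sys

parent_before = os.environ.get("PYTHONPATH")

# Stand-in for ./demo.py: print the variable the child sees
child_code = "import os; print(os.environ.get('PYTHONPATH'))"

# Like 'PYTHONPATH=vendor ./demo.py': the child gets an augmented copy
child_env = {**os.environ, "PYTHONPATH": "vendor"}
seen_by_child = subprocess.run(
    [sys.executable, "-c", child_code],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

print(seen_by_child)
```

The parent's `os.environ` is untouched afterwards, which mirrors the "temporarily augments" wording above.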
|
| 391 |
|
| 392 | You can also mutate the `ENV` object:
|
| 393 |
|
| 394 | setglobal ENV.PYTHONPATH = '.'
|
| 395 | ./demo.py # all future invocations have a different PYTHONPATH
|
| 396 | ./demo.py
|
| 397 |
|
| 398 | And get its attributes:
|
| 399 |
|
| 400 | echo $[ENV.PYTHONPATH] # => .
|
| 401 |
|
| 402 | ### Pipelines
|
| 403 |
|
| 404 | Pipelines are a powerful method of manipulating data streams:
|
| 405 |
|
| 406 | ls | wc -l # count files in this directory
|
| 407 | find /bin -type f | xargs wc -l # count lines in the files of a subtree
|
| 408 |
|
| 409 | The stream may contain (lines of) text, binary data, JSON, TSV, and more.
|
| 410 | Details below.
|
| 411 |
|
| 412 | ### Multi-line Commands
|
| 413 |
|
| 414 | The `...` prefix lets you write long commands, pipelines, and `&&` chains
|
| 415 | without `\` line continuations.
|
| 416 |
|
| 417 | ... find /bin # traverse this directory and
|
| 418 | -type f -a -executable # print executable files
|
| 419 | | sort -r # reverse sort
|
| 420 | | head -n 30 # limit to 30 files
|
| 421 | ;
|
| 422 |
|
| 423 | When this mode is active:
|
| 424 |
|
| 425 | - A single newline behaves like a space
|
| 426 | - A blank line (two newlines in a row) is illegal, but a line that has only a
|
| 427 | comment is allowed. This prevents confusion if you forget the `;`
|
| 428 | terminator.
|
| 429 |
|
| 430 | ### `var`, `setvar`, `const` to Declare and Mutate
|
| 431 |
|
| 432 | Constants can't be modified:
|
| 433 |
|
| 434 | const myconst = 'mystr'
|
| 435 | # setvar myconst = 'foo' would be an error
|
| 436 |
|
| 437 | Modify variables with the `setvar` keyword:
|
| 438 |
|
| 439 | var num_beans = 12
|
| 440 | setvar num_beans = 13
|
| 441 |
|
| 442 | A more complex example:
|
| 443 |
|
| 444 | var d = {name: 'bob', age: 42} # dict literal
|
| 445 | setvar d.name = 'alice' # d.name is a synonym for d['name']
|
| 446 | echo $[d.name] # => alice
|
| 447 |
|
| 448 | That's most of what you need to know about assignments. Advanced users may
|
| 449 | want to use `setglobal` or `call myplace->setValue(42)` in certain situations.
|
| 450 |
|
| 451 | <!--
|
| 452 | var g = 1
|
| 453 | var h = 2
|
| 454 | proc demo(:out) {
|
| 455 | setglobal g = 42
|
| 456 | setref out = 43
|
| 457 | }
|
| 458 | demo :h # pass a reference to h
|
| 459 | echo "$g $h" # => 42 43
|
| 460 | -->
|
| 461 |
|
| 462 | More info: [Variable Declaration and Mutation](variables.html).
|
| 463 |
|
| 464 | ### `for` Loop
|
| 465 |
|
| 466 | #### Words
|
| 467 |
|
| 468 | Shell-style for loops iterate over **words**:
|
| 469 |
|
| 470 | for word in 'oils' $num_beans {pea,coco}nut {
|
| 471 | echo $word
|
| 472 | }
|
| 473 | # =>
|
| 474 | # oils
|
| 475 | # 13
|
| 476 | # peanut
|
| 477 | # coconut
|
| 478 |
|
| 479 | You can ask for the loop index with `i,`:
|
| 480 |
|
| 481 | for i, word in README.md *.py {
|
| 482 | echo "$i - $word"
|
| 483 | }
|
| 484 | # =>
|
| 485 | # 0 - README.md
|
| 486 | # 1 - __init__.py
|
| 487 |
|
| 488 | #### Typed Data
|
| 489 |
|
| 490 | To iterate over typed data, use parentheses around an **expression**. The
|
| 491 | expression should evaluate to an integer `Range`, `List`, `Dict`, or `io.stdin`.
|
| 492 |
|
| 493 | Range:
|
| 494 |
|
| 495 | for i in (3 ..< 5) { # range operator ..<
|
| 496 | echo "i = $i"
|
| 497 | }
|
| 498 | # =>
|
| 499 | # i = 3
|
| 500 | # i = 4
|
| 501 |
|
| 502 | List:
|
| 503 |
|
| 504 | var foods = ['ale', 'bean']
|
| 505 | for item in (foods) {
|
| 506 | echo $item
|
| 507 | }
|
| 508 | # =>
|
| 509 | # ale
|
| 510 | # bean
|
| 511 |
|
| 512 | Again, you can request the index with `for i, item in ...`.
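If you know Python, the range and list loops above map onto `range()` and `enumerate()`. This is a Python analogy rather than YSH code:

```python
# (3 ..< 5) is half-open, like range(3, 5): it stops before 5
indices = list(range(3, 5))

foods = ["ale", "bean"]

# 'for i, item in (foods)' corresponds to enumerate(foods)
pairs = list(enumerate(foods))

print(indices)
print(pairs)
```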
|
| 513 |
|
| 514 | ---
|
| 515 |
|
| 516 | There are **three** ways of iterating over a `Dict`:
|
| 517 |
|
| 518 | var mydict = {pea: 42, nut: 10}
|
| 519 | for key in (mydict) {
|
| 520 | echo $key
|
| 521 | }
|
| 522 | # =>
|
| 523 | # pea
|
| 524 | # nut
|
| 525 |
|
| 526 | for key, value in (mydict) {
|
| 527 | echo "$key - $value"
|
| 528 | }
|
| 529 | # =>
|
| 530 | # pea - 42
|
| 531 | # nut - 10
|
| 532 |
|
| 533 | for i, key, value in (mydict) {
|
| 534 | echo "$i - $key - $value"
|
| 535 | }
|
| 536 | # =>
|
| 537 | # 0 - pea - 42
|
| 538 | # 1 - nut - 10
|
| 539 |
|
| 540 | That is, if you ask for two things, you'll get the key and value. If you ask
|
| 541 | for three, you'll also get the index.
|
| 542 |
|
| 543 | (One way to think of it: `for` loops in YSH have the functionality of Python's
|
| 544 | `enumerate()`, `items()`, `keys()`, and `values()`.)
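To make that comparison concrete, here is how the three `Dict` loops would be spelled in Python. It is an analogy only; the list names are illustrative:

```python
mydict = {"pea": 42, "nut": 10}

# One loop variable: keys only
keys_only = [key for key in mydict]

# Two loop variables: key and value, like items()
key_value = [f"{k} - {v}" for k, v in mydict.items()]

# Three loop variables: index, key, and value, like enumerate(items())
index_key_value = [
    f"{i} - {k} - {v}" for i, (k, v) in enumerate(mydict.items())
]

print(keys_only)
print(key_value)
print(index_key_value)
```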
|
| 545 |
|
| 546 | ---
|
| 547 |
|
| 548 | The `io.stdin` object iterates over lines:
|
| 549 |
|
| 550 | for line in (io.stdin) {
|
| 551 | echo $line
|
| 552 | }
|
| 553 | # lines are buffered, so it's much faster than `while read --raw-line`
|
| 554 |
|
| 555 | <!--
|
| 556 | TODO: Str loop should give you the (UTF-8 offset, rune)
|
| 557 | Or maybe just UTF-8 offset? Decoding errors could be exceptions, or Unicode
|
| 558 | replacement.
|
| 559 | -->
|
| 560 |
|
| 561 | ### `while` Loop
|
| 562 |
|
| 563 | While loops can use a **command** as the termination condition:
|
| 564 |
|
| 565 | while test --file lock {
|
| 566 | sleep 1
|
| 567 | }
|
| 568 |
|
| 569 | Or an **expression**, which is surrounded by `()`:
|
| 570 |
|
| 571 | var i = 3
|
| 572 | while (i < 6) {
|
| 573 | echo "i = $i"
|
| 574 | setvar i += 1
|
| 575 | }
|
| 576 | # =>
|
| 577 | # i = 3
|
| 578 | # i = 4
|
| 579 | # i = 5
|
| 580 |
|
| 581 | ### Conditionals
|
| 582 |
|
| 583 | #### `if elif`
|
| 584 |
|
| 585 | If statements test the exit code of a command, and have optional `elif` and
|
| 586 | `else` clauses:
|
| 587 |
|
| 588 | if test --file foo {
|
| 589 | echo 'foo is a file'
|
| 590 | rm --verbose foo # delete it
|
| 591 | } elif test --dir foo {
|
| 592 | echo 'foo is a directory'
|
| 593 | } else {
|
| 594 | echo 'neither'
|
| 595 | }
|
| 596 |
|
| 597 | Invert the exit code with `!`:
|
| 598 |
|
| 599 | if ! grep alice /etc/passwd {
|
| 600 | echo 'alice is not a user'
|
| 601 | }
|
| 602 |
|
| 603 | As with `while` loops, the condition can also be an **expression** wrapped in
|
| 604 | `()`:
|
| 605 |
|
| 606 | if (num_beans > 0) {
|
| 607 | echo 'so many beans'
|
| 608 | }
|
| 609 |
|
| 610 | var done = false
|
| 611 | if (not done) { # negate with 'not' operator (contrast with !)
|
| 612 | echo "we aren't done"
|
| 613 | }
|
| 614 |
|
| 615 | #### `case`
|
| 616 |
|
| 617 | The case statement is a series of conditionals and executable blocks. The
|
| 618 | condition can be either an unquoted glob pattern like `*.py`, an eggex pattern
|
| 619 | like `/d+/`, or a typed expression like `(42)`:
|
| 620 |
|
| 621 | var s = 'README.md'
|
| 622 | case (s) {
|
| 623 | *.py { echo 'Python' }
|
| 624 | *.cc | *.h { echo 'C++' }
|
| 625 | * { echo 'Other' }
|
| 626 | }
|
| 627 | # => Other
|
| 628 |
|
| 629 | case (s) {
|
| 630 | / dot* '.md' / { echo 'Markdown' }
|
| 631 | (30 + 12) { echo 'the integer 42' }
|
| 632 | (else) { echo 'neither' }
|
| 633 | }
|
| 634 | # => Markdown
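For Python readers, the two `case` statements above resemble a chain of glob tests, regex tests, and equality tests. The sketch below is Python, not YSH. It assumes the eggex arm performs an unanchored search, and the function names `classify` and `classify_typed` are hypothetical:

```python
import fnmatch
import re


def classify(s):
    """Glob arms, tried in order; '*' is the catch-all."""
    if fnmatch.fnmatchcase(s, "*.py"):
        return "Python"
    if fnmatch.fnmatchcase(s, "*.cc") or fnmatch.fnmatchcase(s, "*.h"):
        return "C++"
    return "Other"


def classify_typed(s):
    """A regex arm, then a typed-expression arm, then the else arm."""
    # Rough stand-in for the eggex / dot* '.md' /
    if isinstance(s, str) and re.search(r".*\.md", s):
        return "Markdown"
    if s == 30 + 12:
        return "the integer 42"
    return "neither"


print(classify("README.md"))
print(classify_typed("README.md"))
```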
|
| 635 |
|
| 636 |
|
| 637 | <!--
|
| 638 | (Shell style like `if foo; then ... fi` and `case $x in ... esac` is also
|
| 639 | legal, but discouraged in YSH code.)
|
| 640 | -->
|
| 641 |
|
| 642 | ### Error Handling
|
| 643 |
|
| 644 | If statements are also used for **error handling**. Builtins and external
|
| 645 | commands use this style:
|
| 646 |
|
| 647 | if ! test -d /bin {
|
| 648 | echo 'not a directory'
|
| 649 | }
|
| 650 |
|
| 651 | if ! cp foo /tmp {
|
| 652 | echo 'error copying' # any non-zero status
|
| 653 | }
|
| 654 |
|
| 655 | Procs use this style (because of shell's *disabled `errexit` quirk*):
|
| 656 |
|
| 657 | try {
|
| 658 | myproc
|
| 659 | }
|
| 660 | if failed {
|
| 661 | echo 'failed'
|
| 662 | }
|
| 663 |
|
| 664 | For a complete list of examples, see [YSH Error
|
| 665 | Handling](ysh-error.html). For design goals and a reference, see [YSH
|
| 666 | Fixes Shell's Error Handling](error-handling.html).
|
| 667 |
|
| 668 | #### exit, break, continue, return
|
| 669 |
|
| 670 | The `exit` **keyword** exits a process. (It's not a shell builtin.)
|
| 671 |
|
| 672 | The other 3 control flow keywords behave like they do in Python and JavaScript.
|
| 673 |
|
| 674 | ### Shell-like `proc`
|
| 675 |
|
| 676 | You can define units of code with the `proc` keyword. A `proc` is like a
|
| 677 | *procedure* or *process*.
|
| 678 |
|
| 679 | proc my-ls {
|
| 680 | ls -a -l @ARGV # pass args through
|
| 681 | }
|
| 682 |
|
| 683 | Simple procs like this are invoked like a shell command:
|
| 684 |
|
| 685 | my-ls /dev/null /etc/passwd
|
| 686 |
|
| 687 | You can name the parameters, and add a doc comment with `###`:
|
| 688 |
|
| 689 | proc mycopy (src, dest) {
|
| 690 | ### Copy verbosely
|
| 691 |
|
| 692 | mkdir -p $dest
|
| 693 | cp --verbose $src $dest
|
| 694 | }
|
| 695 | touch log.txt
|
| 696 | mycopy log.txt /tmp # first word 'mycopy' is a proc
|
| 697 |
|
| 698 | Procs have many features, including **four** kinds of arguments:
|
| 699 |
|
| 700 | 1. Word args (which are always strings)
|
| 701 | 1. Typed, positional args
|
| 702 | 1. Typed, named args
|
| 703 | 1. A final block argument, which may be written with `{ }`.
|
| 704 |
|
| 705 | At the call site, they can look like any of these forms:
|
| 706 |
|
| 707 | ls /tmp # word arg
|
| 708 |
|
| 709 | json write (d) # word arg, then positional arg
|
| 710 |
|
| 711 | try {
|
| 712 | error 'failed' (status=9) # word arg, then named arg
|
| 713 | }
|
| 714 |
|
| 715 | cd /tmp { echo $PWD } # word arg, then block arg
|
| 716 |
|
| 717 | pp value ([1, 2]) # positional, typed arg
|
| 718 |
|
| 719 | <!-- TODO: lazy arg list: ls8 | where [age > 10] -->
|
| 720 |
|
| 721 | At the definition site, the kinds of parameters are separated with `;`, similar
|
| 722 | to the Julia language:
|
| 723 |
|
| 724 | proc p2 (word1, word2; pos1, pos2, ...rest_pos) {
|
| 725 | echo "$word1 $word2 $[pos1 + pos2]"
|
| 726 | json write (rest_pos)
|
| 727 | }
|
| 728 |
|
| 729 | proc p3 (w ; ; named1, named2, ...rest_named; block) {
|
| 730 | echo "$w $[named1 + named2]"
|
| 731 | call io->eval(block)
|
| 732 | json write (rest_named)
|
| 733 | }
|
| 734 |
|
| 735 | proc p4 (; ; ; block) {
|
| 736 | call io->eval(block)
|
| 737 | }
|
| 738 |
|
| 739 | YSH also has Python-like functions defined with `func`. These are part of the
|
| 740 | expression language, which we'll see later.
|
| 741 |
|
| 742 | For more info, see the [Guide to Procs and Funcs](proc-func.html).
|
| 743 |
|
| 744 | ### Ruby-like Block Arguments
|
| 745 |
|
| 746 | A block is a value of type `Command`. For example, `shopt` is a builtin
|
| 747 | command that takes a block argument:
|
| 748 |
|
| 749 | shopt --unset errexit { # ignore errors
|
| 750 | cp ale /tmp
|
| 751 | cp bean /bin
|
| 752 | }
|
| 753 |
|
| 754 | In this case, the block doesn't form a new scope.
|
| 755 |
|
| 756 | #### Block Scope / Closures
|
| 757 |
|
| 758 | However, by default, block arguments capture the frame they're defined in.
|
| 759 | This means they obey *lexical scope*.
|
| 760 |
|
| 761 | Consider this proc, which accepts a block, and runs it:
|
| 762 |
|
| 763 | proc do-it (; ; ; block) {
|
| 764 | call io->eval(block)
|
| 765 | }
|
| 766 |
|
| 767 | When the block arg is passed, the enclosing stack frame is captured. This
|
| 768 | means that code inside the block can use variables in the captured frame:
|
| 769 |
|
| 770 | var x = 42
|
| 771 | do-it {
|
| 772 | echo "x = $x" # outer x is visible LATER, when the block is run
|
| 773 | }
|
| 774 |
|
| 775 | - [Feature Index: Closures](ref/feature-index.html#Closures)
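Python closures behave in a comparable way, which may help if the idea of capturing a frame is unfamiliar. This is a Python analogy, not YSH; `do_it` and `make_block` are hypothetical names:

```python
def do_it(block):
    # Stand-in for 'call io->eval(block)': run the block later
    return block()


def make_block():
    x = 42
    # The lambda captures the frame where it is defined
    return lambda: f"x = {x}"


block = make_block()

# make_block() has already returned, yet x is still visible to the block
message = do_it(block)
print(message)
```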
|
| 776 |
|
| 777 | ### Builtin Commands
|
| 778 |
|
| 779 | **Shell builtins** like `cd` and `read` are the "standard library" of the
|
| 780 | command language. Each one takes various flags:
|
| 781 |
|
| 782 | cd -L . # follow symlinks
|
| 783 |
|
| 784 | echo foo | read --all # read all of stdin
|
| 785 |
|
| 786 | Here are some categories of builtin:
|
| 787 |
|
| 788 | - I/O: `echo write read`
|
| 789 | - File system: `cd test`
|
| 790 | - Processes: `fork wait forkwait exec`
|
| 791 | - Interpreter settings: `shopt shvar`
|
| 792 | - Meta: `command builtin runproc type eval`
|
| 793 |
|
| 794 | <!-- TODO: Link to a comprehensive list of builtins -->
|
| 795 |
|
| 796 | ## Expression Language: Python-like Types
|
| 797 |
|
| 798 | YSH expressions look and behave more like Python or JavaScript than shell. For
|
| 799 | example, we write `if (x < y)` instead of `if [ $x -lt $y ]`. Expressions are
|
| 800 | usually surrounded by `( )`.
|
| 801 |
|
| 802 | At runtime, variables like `x` and `y` are bound to **typed data**, like
|
| 803 | integers, floats, strings, lists, and dicts.
|
| 804 |
|
| 805 | <!--
|
| 806 | [Command vs. Expression Mode](command-vs-expression-mode.html) may help you
|
| 807 | understand how YSH is parsed.
|
| 808 | -->
|
| 809 |
|
| 810 | ### Python-like `func`
|
| 811 |
|
| 812 | At the end of the *Command Language*, we saw that procs are shell-like units of
|
| 813 | code. YSH also has Python-like **functions**, which are different than
|
| 814 | `procs`:
|
| 815 |
|
| 816 | - They're defined with the `func` keyword.
|
| 817 | - They're called in expressions, not in commands.
|
| 818 | - They're **pure**, and live in the **interior** of a process.
|
| 819 | - In contrast, procs usually perform I/O, and have **exterior** boundaries.
|
| 820 |
|
| 821 | The simplest function is:
|
| 822 |
|
| 823 | func identity(x) {
|
| 824 | return (x) # parens required for typed return
|
| 825 | }
|
| 826 |
|
| 827 | A more complex pure function:
|
| 828 |
|
| 829 | func myRepeat(s, n; special=false) { # positional; named params
|
| 830 | var parts = []
|
| 831 | for i in (0 ..< n) {
|
| 832 | append $s (parts)
|
| 833 | }
|
| 834 | var result = join(parts)
|
| 835 |
|
| 836 | if (special) {
|
| 837 | return ("$result !!")
|
| 838 | } else {
|
| 839 | return (result)
|
| 840 | }
|
| 841 | }
|
| 842 |
|
| 843 | echo $[myRepeat('z', 3)] # => zzz
|
| 844 |
|
| 845 | echo $[myRepeat('z', 3, special=true)] # => zzz !!
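For comparison, a Python function with the same shape might look like the sketch below. It is not YSH; the keyword-only marker `*` plays the role that `;` plays in the YSH parameter list, and the name `my_repeat` is illustrative:

```python
def my_repeat(s, n, *, special=False):
    """Repeat s n times; 'special' can only be passed by name."""
    parts = []
    for _ in range(n):
        parts.append(s)        # like: append $s (parts)
    result = "".join(parts)    # like: join(parts)

    if special:
        return f"{result} !!"
    return result


print(my_repeat("z", 3))
print(my_repeat("z", 3, special=True))
```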
|
| 846 |
|
| 847 | A function that mutates its argument:
|
| 848 |
|
| 849 | func popTwice(mylist) {
|
| 850 | call mylist->pop()
|
| 851 | call mylist->pop()
|
| 852 | }
|
| 853 |
|
| 854 | var mylist = [3, 4]
|
| 855 |
|
| 856 | # The call keyword is an "adapter" between commands and expressions,
|
| 857 | # like the = keyword.
|
| 858 | call popTwice(mylist)
|
| 859 |
|
| 860 |
|
| 861 | Funcs are named using `camelCase`, while procs use `kebab-case`. See the
|
| 862 | [Style Guide](style-guide.html) for more conventions.
|
| 863 |
|
| 864 | #### Builtin Functions
|
| 865 |
|
| 866 | In addition to builtin commands, YSH has Python-like builtin **functions**.
|
| 867 | These are like the "standard library" for the expression language. Examples:
|
| 868 |
|
| 869 | - Functions that take multiple types: `len() type()`
|
| 870 | - Conversions: `bool() int() float() str() list() ...`
|
| 871 | - Explicit word evaluation: `split() join() glob() maybe()`
|
| 872 |
|
| 873 | <!-- TODO: Make a comprehensive list of func builtins. -->
|
| 874 |
|
| 875 |
|
| 876 | ### Data Types: `Int`, `Str`, `List`, `Dict`, `Obj`, ...
|
| 877 |
|
| 878 | YSH has data types, each with an expression syntax and associated methods.
|
| 879 |
|
| 880 | ### Methods
|
| 881 |
|
| 882 | Non-mutating methods are looked up with the `.` operator:
|
| 883 |
|
| 884 | var line = ' ale bean '
|
| 885 | var caps = line.trim().upper() # 'ALE BEAN'
|
| 886 |
|
| 887 | Mutating methods are looked up with a thin arrow `->`:
|
| 888 |
|
| 889 | var foods = ['ale', 'bean']
|
| 890 | var last = foods->pop() # bean
|
| 891 | write @foods # => ale
|
| 892 |
|
| 893 | You can ignore the return value with the `call` keyword:
|
| 894 |
|
| 895 | call foods->pop()
|
| 896 |
|
| 897 | That is, YSH adds mutable data structures to shell, so we have a special syntax
|
| 898 | for mutation.
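Python makes the same distinction between transforming and mutating methods, but spells both with `.`, so the difference is invisible at the call site. A short Python sketch for contrast (not YSH):

```python
line = " ale bean "
caps = line.strip().upper()   # returns a new string; 'line' is unchanged

foods = ["ale", "bean"]
last = foods.pop()            # mutates 'foods' in place and returns the item

print(repr(line), caps)
print(last, foods)
```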
|
| 899 |
|
| 900 | ---
|
| 901 |
|
| 902 | You can also chain functions with a fat arrow `=>`:
|
| 903 |
|
| 904 | var trimmed = line.trim() => upper() # 'ALE BEAN'
|
| 905 |
|
| 906 | The `=>` operator allows functions to appear in a natural left-to-right order,
|
| 907 | like methods.
|
| 908 |
|
| 909 | # list() is a free function taking one arg
|
| 910 | # join() is a free function taking two args
|
| 911 | var x = {k1: 42, k2: 43} => list() => join('/') # 'k1/k2'
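Python has no equivalent operator, so the same computation is usually written inside-out as nested calls. The sketch below imitates left-to-right chaining with a small helper. `chain` is a hypothetical function written for this illustration, not a YSH or Python builtin:

```python
from functools import reduce


def chain(value, *funcs):
    """Apply funcs to value from left to right."""
    return reduce(lambda acc, f: f(acc), funcs, value)


d = {"k1": 42, "k2": 43}

# Inside-out: join is written first but runs last
nested = "/".join(list(d))

# Left to right: list() runs first, then join
chained = chain(d, list, "/".join)

print(nested, chained)
```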
|
| 912 |
|
| 913 | ---
|
| 914 |
|
| 915 | Now let's go through the data types in YSH. We'll show the syntax for
|
| 916 | literals, and what **methods** they have.
|
| 917 |
|
| 918 | #### Null and Bool
|
| 919 |
|
| 920 | YSH uses JavaScript-like spellings for these three "atoms":
|
| 921 |
|
| 922 | var x = null
|
| 923 |
|
| 924 | var b1, b2 = true, false
|
| 925 |
|
| 926 | if (b1) {
|
| 927 | echo 'yes'
|
| 928 | } # => yes
|
| 929 |
|
| 930 |
|
| 931 | #### Int
|
| 932 |
|
| 933 | There are many ways to write integers:
|
| 934 |
|
| 935 | var small, big = 42, 65_536
|
| 936 | echo "$small $big" # => 42 65536
|
| 937 |
|
| 938 | var hex, octal, binary = 0x0001_0000, 0o755, 0b0001_0101
|
| 939 | echo "$hex $octal $binary" # => 65536 493 21
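These literal forms happen to be valid Python 3 as well, so you can check the values in a Python prompt. The snippet is Python, not YSH, and `hex_` is renamed only to avoid shadowing Python's builtin `hex()`:

```python
small, big = 42, 65_536
hex_, octal, binary = 0x0001_0000, 0o755, 0b0001_0101

print(f"{small} {big}")
print(f"{hex_} {octal} {binary}")
```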
|
| 940 |
|
| 941 | <!--
|
| 942 | "Runes" are integers that represent Unicode code points. They're not common in
|
| 943 | YSH code, but can make certain string algorithms more readable.
|
| 944 |
|
| 945 | # Pound rune literals are similar to ord('A')
|
| 946 | const a = #'A'
|
| 947 |
|
| 948 | # Backslash rune literals can appear outside of quotes
|
| 949 | const newline = \n # Remember this is an integer
|
| 950 | const backslash = \\ # ditto
|
| 951 |
|
| 952 | # Unicode rune literal is syntactic sugar for 0x3bc
|
| 953 | const mu = \u{3bc}
|
| 954 |
|
| 955 | echo "chars $a $newline $backslash $mu" # => chars 65 10 92 956
|
| 956 | -->
|
| 957 |
|
| 958 | #### Float
|
| 959 |
|
| 960 | Floats are written with a decimal point:
|
| 961 |
|
| 962 | var big = 3.14
|
| 963 |
|
| 964 | You can use scientific notation, as in Python:
|
| 965 |
|
| 966 | var small = 1.5e-10
|
| 967 |
|
| 968 | #### Str
|
| 969 |
|
| 970 | See the section above on *Three Kinds of String Literals*. It described
|
| 971 | `'single quoted'`, `"double ${quoted}"`, and `u'J8-style\n'` strings; as well
|
| 972 | as their multiline variants.
|
| 973 |
|
| 974 | Strings are UTF-8 encoded in memory, like strings in the [Go
|
| 975 | language](https://golang.org). There isn't a separate string and unicode type,
|
| 976 | as in Python.
|
| 977 |
|
| 978 | Strings are **immutable**, as in Python and JavaScript. This means they only
|
| 979 | have **transforming** methods:
|
| 980 |
|
| 981 | var x = s.trim()
|
| 982 |
|
| 983 | Other methods:
|
| 984 |
|
| 985 | - `trimLeft() trimRight()`
|
| 986 | - `trimPrefix() trimSuffix()`
|
| 987 | - `upper() lower()`
|
| 988 | - `search() leftMatch()` - pattern matching
|
| 989 | - `replace() split()`
|
| 990 |
|
| 991 | #### List (and Arrays)
|
| 992 |
|
| 993 | All lists can be expressed with Python-like literals:
|
| 994 |
|
| 995 | var foods = ['ale', 'bean', 'corn']
|
| 996 | var recursive = [1, [2, 3]]
|
| 997 |
|
| 998 | As a special case, lists of strings are called **arrays**. It's often more
|
| 999 | convenient to write them with shell-like literals:
|
| 1000 |
|
| 1001 | # No quotes or commas
|
| 1002 | var foods = :| ale bean corn |
|
| 1003 |
|
| 1004 | # You can use the word language here
|
| 1005 | var other = :| foo $s *.py {alice,bob}@example.com |
|
| 1006 |
|
| 1007 | Lists are **mutable**, as in Python and JavaScript. So they mainly have
|
| 1008 | mutating methods:
|
| 1009 |
|
| 1010 | call foods->reverse()
|
| 1011 | write -- @foods
|
| 1012 | # =>
|
| 1013 | # corn
|
| 1014 | # bean
|
| 1015 | # ale
|
| 1016 |
|
| 1017 | #### Dict
|
| 1018 |
|
| 1019 | Dicts use syntax that's like JavaScript. Here's a dict literal:
|
| 1020 |
|
| 1021 | var d = {
|
| 1022 | name: 'bob', # unquoted keys are allowed
|
| 1023 | age: 42,
|
| 1024 | 'key with spaces': 'val'
|
| 1025 | }
|
| 1026 |
|
| 1027 | You can use either `[]` or `.` to retrieve a value, given a key:
|
| 1028 |
|
| 1029 | var v1 = d['name']
|
| 1030 | var v2 = d.name # shorthand for the above
|
| 1031 | var v3 = d['key with spaces'] # no shorthand for this
|
| 1032 |
|
| 1033 | (If the key doesn't exist, an error is raised.)
|
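If a key may be absent, one way to avoid the error is to test for it first, or to supply a default. This is a minimal sketch; it assumes the `in` operator and the builtin `get()` function behave as in recent YSH releases, and the `email` key is invented for the example:

```ysh
var d = {name: 'bob', age: 42}

# Test for the key before using it
if ('email' in d) {
  echo $[d.email]
} else {
  echo 'no email'
}                                         # => no email

# Or fall back to a default value
var email = get(d, 'email', 'unknown')
echo $email                               # => unknown
```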
| 1034 |
|
| 1035 | You can change Dict values with the same 2 syntaxes:
|
| 1036 |
|
| 1037 | setvar d['name'] = 'other'
|
| 1038 | setvar d.name = 'fun'
|
| 1039 |
|
| 1040 | ---
|
| 1041 |
|
| 1042 | If you want to compute a key name, use an expression inside `[]`:
|
| 1043 |
|
| 1044 | var key = 'alice'
|
| 1045 | var d2 = {[key ++ '_z']: 'ZZZ'} # Computed key name
|
| 1046 | echo $[d2.alice_z] # => ZZZ
|
| 1047 |
|
| 1048 | If you omit the value, it's taken from a variable of the same name:
|
| 1049 |
|
| 1050 | var d3 = {key} # like {key: key}
|
| 1051 | echo "name is $[d3.key]" # => name is alice
|
| 1052 |
|
| 1053 | More examples:
|
| 1054 |
|
| 1055 | var empty = {}
|
| 1056 | echo $[len(empty)] # => 0
|
| 1057 |
|
| 1058 | The `keys()` and `values()` functions return new `List` objects:
|
| 1059 |
|
| 1060 | var keys = keys(d2) # => alice_z
|
| 1061 | var vals = values(d3) # => alice
|
| 1062 |
|
| 1063 | #### Obj
|
| 1064 |
|
| 1065 | YSH has an `Obj` type that bundles **code** and **data**. (In contrast, JSON
|
| 1066 | messages are pure data, not objects.)
|
| 1067 |
|
| 1068 | The main purpose of objects is **polymorphism**:
|
| 1069 |
|
| 1070 | var obj = makeMyObject(42) # I don't know what it looks like inside
|
| 1071 |
|
| 1072 | echo $[obj.myMethod()] # But I can perform abstract operations
|
| 1073 |
|
| 1074 | call obj->mutatingMethod() # Mutation is considered special, with ->
|
| 1075 |
|
| 1076 | YSH objects are similar to Lua and JavaScript objects. They can be thought of
|
| 1077 | as a linked list of `Dict` instances.
|
| 1078 |
|
| 1079 | Or you can say they have a `Dict` of properties, and a recursive "prototype
|
| 1080 | chain" that is also an `Obj`.
|
| 1081 |
|
| 1082 | - [Feature Index: Objects](ref/feature-index.html#Objects)
|
| 1083 |
|
| 1084 | ### `Place` type / "out params"
|
| 1085 |
|
| 1086 | The `read` builtin can set an implicit variable `_reply`:
|
| 1087 |
|
| 1088 | whoami | read --all # sets _reply
|
| 1089 |
|
| 1090 | Or you can pass a `value.Place`, created with `&`:
|
| 1091 |
|
| 1092 | var x # implicitly initialized to null
|
| 1093 | whoami | read --all (&x) # mutate this "place"
|
| 1094 | echo who=$x # => who=andy
|
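Your own procs can accept a place too. The sketch below assumes the `setValue()` method on `Place` and the `proc name (; typed_param)` signature syntax; the proc name `get-user` and the value `'alice'` are hypothetical:

```ysh
# A proc with one typed parameter: a Place to write the result into
proc get-user (; out) {
  call out->setValue('alice')
}

var user              # implicitly initialized to null
get-user (&user)      # pass the place, as with 'read'
echo user=$user       # => user=alice
```

This is why places are sometimes called "out params": the caller owns the variable, and the proc only receives permission to set it.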
| 1095 |
|
| 1096 | <!--
|
| 1097 | #### Quotation Types: value.Command (Block) and value.Expr
|
| 1098 |
|
| 1099 | These types are for reflection on YSH code. Most YSH programs won't use them
|
| 1100 | directly.
|
| 1101 |
|
| 1102 | - `Command`: an unevaluated code block.
|
| 1103 | - rarely-used literal: `^(ls | wc -l)`
|
| 1104 | - `Expr`: an unevaluated expression.
|
| 1105 | - rarely-used literal: `^[42 + a[i]]`
|
| 1106 | -->
|
| 1107 |
|
| 1108 | ### Operators
|
| 1109 |
|
| 1110 | YSH operators are generally the same as in Python:
|
| 1111 |
|
| 1112 | if (10 <= num_beans and num_beans < 20) {
|
| 1113 | echo 'enough'
|
| 1114 | } # => enough
|
| 1115 |
|
| 1116 | YSH has a few operators that aren't in Python. Equality can be approximate or
|
| 1117 | exact:
|
| 1118 |
|
| 1119 | var n = ' 42 '
|
| 1120 | if (n ~== 42) {
|
| 1121 | echo 'equal after stripping whitespace and type conversion'
|
| 1122 | } # => equal after stripping whitespace and type conversion
|
| 1123 |
|
| 1124 | if (n === 42) {
|
| 1125 | echo "not reached because strings and ints aren't equal"
|
| 1126 | }
|
| 1127 |
|
| 1128 | <!-- TODO: is n === 42 a type error? -->
|
| 1129 |
|
| 1130 | Pattern matching can be done with globs (`~~` and `!~~`)
|
| 1131 |
|
| 1132 | const filename = 'foo.py'
|
| 1133 | if (filename ~~ '*.py') {
|
| 1134 | echo 'Python'
|
| 1135 | } # => Python
|
| 1136 |
|
| 1137 | if (filename !~~ '*.sh') {
|
| 1138 | echo 'not shell'
|
| 1139 | } # => not shell
|
| 1140 |
|
| 1141 | or regular expressions (`~` and `!~`). See the Eggex section below for an
|
| 1142 | example of the latter.
|
| 1143 |
|
| 1144 | Concatenation is `++` rather than `+` because it avoids confusion in the
|
| 1145 | presence of type conversion:
|
| 1146 |
|
| 1147 | var n = 42 + 1 # + always adds numbers; strings that look like ints are converted
|
| 1148 | echo $n # => 43
|
| 1149 |
|
| 1150 | var y = 'ale ' ++ "bean $n" # concatenation
|
| 1151 | echo $y # => ale bean 43
|
| 1152 |
|
| 1153 | <!--
|
| 1154 | TODO: change example above
|
| 1155 | var n = '42' + 1 # string plus int does implicit conversion
|
| 1156 | -->
|
| 1157 |
|
| 1158 | <!--
|
| 1159 |
|
| 1160 | #### Summary of Operators
|
| 1161 |
|
| 1162 | - Arithmetic: `+ - * / // %` and `**` for exponentiation
|
| 1163 | - `/` always yields a float, and `//` is integer division
|
| 1164 | - Bitwise: `& | ^ ~`
|
| 1165 | - Logical: `and or not`
|
| 1166 | - Comparison: `== < > <= >= in 'not in'`
|
| 1167 | - Approximate equality: `~==`
|
| 1168 | - Eggex and glob match: `~ !~ ~~ !~~`
|
| 1169 | - Ternary: `1 if x else 0`
|
| 1170 | - Index and slice: `mylist[3]` and `mylist[1:3]`
|
| 1171 | - `mydict->key` is a shortcut for `mydict['key']`
|
| 1172 | - Function calls
|
| 1173 | - free: `f(x, y)`
|
| 1174 | - transformations and chaining: `s => startsWith('prefix')`
|
| 1175 | - mutating methods: `mylist->pop()`
|
| 1176 | - String and List: `++` for concatenation
|
| 1177 | - This is a separate operator because the addition operator `+` does
|
| 1178 | string-to-int conversion
|
| 1179 |
|
| 1180 | TODO: What about list comprehensions?
|
| 1181 | -->
|
| 1182 |
|
| 1183 | ### Egg Expressions (YSH Regexes)
|
| 1184 |
|
| 1185 | An *Eggex* is a YSH expression that denotes a regular expression. Eggexes
|
| 1186 | translate to POSIX ERE syntax, for use with tools like `egrep`, `awk`, and `sed
|
| 1187 | --regexp-extended` (GNU only).
|
| 1188 |
|
| 1189 | They're designed to be readable and composable. Example:
|
| 1190 |
|
| 1191 | var D = / digit{1,3} /
|
| 1192 | var ip_pattern = / D '.' D '.' D '.' D /
|
| 1193 |
|
| 1194 | var z = '192.168.0.1'
|
| 1195 | if (z ~ ip_pattern) { # Use the ~ operator to match
|
| 1196 | echo "$z looks like an IP address"
|
| 1197 | } # => 192.168.0.1 looks like an IP address
|
| 1198 |
|
| 1199 | if (z !~ / '.255' %end /) {
|
| 1200 | echo "doesn't end with .255"
|
| 1201 | } # => doesn't end with .255
|
| 1202 |
|
| 1203 | See the [Egg Expressions doc](eggex.html) for details.
|
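Because an eggex translates to ERE syntax, you can also hand one to an external tool. This sketch assumes that substituting an eggex into a word yields its ERE translation; the file `hosts.txt` and its contents are hypothetical:

```ysh
var D = / digit{1,3} /
var ip_pattern = / D '.' D '.' D '.' D /

echo '192.168.0.1 router' > hosts.txt    # hypothetical sample data

echo $ip_pattern                  # prints the translated ERE string
grep -E $ip_pattern hosts.txt     # => 192.168.0.1 router
```

The same pattern value works with the `~` operator inside YSH and with `grep -E` outside it, so you don't have to maintain two spellings of one regex.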
| 1204 |
|
| 1205 | ## Interlude
|
| 1206 |
|
| 1207 | Before moving on to other YSH features, let's review what we've seen.
|
| 1208 |
|
| 1209 | ### Three Interleaved Languages
|
| 1210 |
|
| 1211 | Here are the languages we saw in the last 3 sections:
|
| 1212 |
|
| 1213 | 1. **Words** evaluate to a string, or list of strings. This includes:
|
| 1214 | - literals like `'mystr'`
|
| 1215 | - substitutions like `${x}` and `$(hostname)`
|
| 1216 | - globs like `*.sh`
|
| 1217 | 2. **Commands** are used for
|
| 1218 | - I/O: pipelines, builtins like `read`
|
| 1219 | - control flow: `if`, `for`
|
| 1220 | - abstraction: `proc`
|
| 1221 | 3. **Expressions** on typed data are borrowed from Python, with influence from
|
| 1222 | JavaScript:
|
| 1223 | - Lists: `['ale', 'bean']` or `:| ale bean |`
|
| 1224 | - Dicts: `{name: 'bob', age: 42}`
|
| 1225 | - Functions: `split('ale bean')` and `join(['pea', 'nut'])`
|
| 1226 |
|
| 1227 | ### How Do They Work Together?
|
| 1228 |
|
| 1229 | Here are two examples:
|
| 1230 |
|
| 1231 | (1) In this *command*, there are **four** *words*. The fourth word is an
|
| 1232 | *expression sub* `$[]`.
|
| 1233 |
|
| 1234 | write hello $name $[d['age'] + 1]
|
| 1235 | # =>
|
| 1236 | # hello
|
| 1237 | # world
|
| 1238 | # 43
|
| 1239 |
|
| 1240 | (2) In this assignment, the *expression* on the right hand side of `=`
|
| 1241 | concatenates two strings. The first string is a literal, and the second is a
|
| 1242 | *command sub*.
|
| 1243 |
|
| 1244 | var food = 'ale ' ++ $(echo bean | tr a-z A-Z)
|
| 1245 | write $food # => ale BEAN
|
| 1246 |
|
| 1247 | So words, commands, and expressions are **mutually recursive**. If you're a
|
| 1248 | conceptual person, skimming [Syntactic Concepts](syntactic-concepts.html) may
|
| 1249 | help you understand this on a deeper level.
|
| 1250 |
|
| 1251 | <!--
|
| 1252 | One way to think about these sublanguages is to note that the `|` character
|
| 1253 | means something different in each context:
|
| 1254 |
|
| 1255 | - In the command language, it's the pipeline operator, as in `ls | wc -l`
|
| 1256 | - In the word language, it's only valid in a literal string like `'|'`, `"|"`,
|
| 1257 | or `\|`. (It's also used in `${x|html}`, which formats a string.)
|
| 1258 | - In the expression language, it's the bitwise OR operator, as in Python and
|
| 1259 | JavaScript.
|
| 1260 | -->
|
| 1261 |
|
| 1262 | ---
|
| 1263 |
|
| 1264 | Let's move on from talking about **code**, and talk about **data**.
|
| 1265 |
|
| 1266 | ## Data Notation / Interchange Formats
|
| 1267 |
|
| 1268 | In YSH, you can read and write data languages based on [JSON]($xref). This is
|
| 1269 | a primary way to exchange messages between Unix processes.
|
| 1270 |
|
| 1271 | Instead of being **executed**, like our command/word/expression languages,
|
| 1272 | these languages are **parsed** as data structures.
|
| 1273 |
|
| 1274 | <!-- TODO: Link to slogans, fallacies, and concepts -->
|
| 1275 |
|
| 1276 | ### UTF-8
|
| 1277 |
|
| 1278 | UTF-8 is the foundation of our data notation. It's the most common Unicode
|
| 1279 | encoding, and the most consistent:
|
| 1280 |
|
| 1281 | var x = u'hello \u{1f642}' # store a UTF-8 string in memory
|
| 1282 | echo $x # send UTF-8 to stdout
|
| 1283 |
|
| 1284 | hello 🙂
|
| 1285 |
|
| 1286 | <!-- TODO: there's a runes() iterator which gives integer offsets, usable for
|
| 1287 | slicing -->
|
| 1288 |
|
| 1289 | ### JSON
|
| 1290 |
|
| 1291 | JSON messages are UTF-8 text. You can encode and decode JSON with functions
|
| 1292 | (`func` style):
|
| 1293 |
|
| 1294 | var message = toJson({x: 42}) # => (Str) '{"x": 42}'
|
| 1295 | var mydict = fromJson('{"x": 42}') # => (Dict) {x: 42}
|
| 1296 |
|
| 1297 | Or with commands (`proc` style):
|
| 1298 |
|
| 1299 | json write ({x: 42}) > foo.json # writes '{"x": 42}'
|
| 1300 |
|
| 1301 | json read (&mydict) < foo.json # create var
|
| 1302 | = mydict # => (Dict) {x: 42}
|
| 1303 |
|
| 1304 | ### J8 Notation
|
| 1305 |
|
| 1306 | But JSON isn't quite enough for a principled shell.
|
| 1307 |
|
| 1308 | - Traditional Unix tools like `grep` and `awk` operate on streams of **lines**.
|
| 1309 | In YSH, to avoid data-dependent bugs, we want a reliable way of **quoting**
|
| 1310 | lines.
|
| 1311 | - In YSH, we also want to represent **binary** data, not just text. When you
|
| 1312 | read a Unix file, it may or may not be text.
|
| 1313 |
|
| 1314 | So we borrow JSON-style strings, and create [J8 Notation][]. Slogans:
|
| 1315 |
|
| 1316 | - *Deconstructing and Augmenting JSON*
|
| 1317 | - *Fixing the JSON-Unix Mismatch*
|
| 1318 |
|
| 1319 | [J8 Notation]: $xref:j8-notation
|
| 1320 |
|
| 1321 | #### J8 Lines
|
| 1322 |
|
| 1323 | *J8 Lines* are a building block of J8 Notation. If you have a file
|
| 1324 | `lines.txt`:
|
| 1325 |
|
| 1326 | <pre>
|
| 1327 | doc/hello.md
|
| 1328 | "doc/with spaces.md"
|
| 1329 | b'doc/with byte \yff.md'
|
| 1330 | </pre>
|
| 1331 |
|
| 1332 | Then you can decode it with *split command sub* (mentioned above):
|
| 1333 |
|
| 1334 | var decoded = @(cat lines.txt)
|
| 1335 |
|
| 1336 | This file has:
|
| 1337 |
|
| 1338 | 1. An unquoted string
|
| 1339 | 1. A JSON string with `"double quotes"`
|
| 1340 | 1. A J8-style string: `u'unicode'` or `b'bytes'`
|
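To make the decoding concrete, here is a small sketch that recreates `lines.txt` and inspects the result. It assumes `@()` decodes J8 Lines as described above; the expected values follow from that assumption:

```ysh
# Recreate the three lines shown above
write -- 'doc/hello.md' > lines.txt
write -- '"doc/with spaces.md"' >> lines.txt
write -- "b'doc/with byte \\yff.md'" >> lines.txt

var decoded = @(cat lines.txt)

echo $[len(decoded)]     # => 3
echo $[decoded[1]]       # => doc/with spaces.md
```

Note that the quotes are part of the *notation*, not the data: the second element is the bare file name, with its space intact.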
| 1341 |
|
| 1342 | <!--
|
| 1343 | TODO: fromJ8Line() toJ8Line()
|
| 1344 | -->
|
| 1345 |
|
| 1346 | #### JSON8 is Tree-Shaped
|
| 1347 |
|
| 1348 | JSON8 is just like JSON, but it allows J8-style strings:
|
| 1349 |
|
| 1350 | <pre>
|
| 1351 | { "foo": "hi \uD83D\uDE42"} # valid JSON, and valid JSON8
|
| 1352 | {u'foo': u'hi \u{1F642}' } # valid JSON8, with J8-style strings
|
| 1353 | </pre>
|
| 1354 |
|
| 1355 | <!--
|
| 1356 | In addition to strings and lines, you can write and read **tree-shaped** data
|
| 1357 | as [JSON][]:
|
| 1358 |
|
| 1359 | var d = {key: 'value'}
|
| 1360 | json write (d) # dump variable d as JSON
|
| 1361 | # =>
|
| 1362 | # {
|
| 1363 | # "key": "value"
|
| 1364 | # }
|
| 1365 |
|
| 1366 | echo '["ale", 42]' > example.json
|
| 1367 |
|
| 1368 | json read (&d2) < example.json # parse JSON into var d2
|
| 1369 | pp (d2) # pretty print it
|
| 1370 | # => (List) ['ale', 42]
|
| 1371 |
|
| 1372 | [JSON][] will lose information when strings have binary data, but the slight
|
| 1373 | [JSON8]($xref) upgrade won't:
|
| 1374 |
|
| 1375 | var b = {binary: $'\xff'}
|
| 1376 | json8 write (b)
|
| 1377 | # =>
|
| 1378 | # {
|
| 1379 | # "binary": b'\yff'
|
| 1380 | # }
|
| 1381 | -->
|
| 1382 |
|
| 1383 | [JSON]: $xref
|
| 1384 |
|
| 1385 | #### TSV8 is Table-Shaped
|
| 1386 |
|
| 1387 | (TODO: not yet implemented.)
|
| 1388 |
|
| 1389 | YSH supports data notation for tables:
|
| 1390 |
|
| 1391 | 1. Plain TSV files, which are untyped. Every column has string data.
|
| 1392 | - Cells with tabs, newlines, and binary data are a problem.
|
| 1393 | 2. Our extension [TSV8]($xref), which supports typed data.
|
| 1394 | - It uses JSON notation for booleans, integers, and floats.
|
| 1395 | - It uses J8 strings, which can represent any string.
|
| 1396 |
|
| 1397 | <!-- Figure out the API. Does it work like JSON?
|
| 1398 |
|
| 1399 | Or I think we just implement
|
| 1400 | - rows: 'where' or 'filter' (dplyr)
|
| 1401 | - cols: 'select' conflicts with shell builtin; call it 'cols'?
|
| 1402 | - sort: 'sort-by' or 'arrange' (dplyr)
|
| 1403 | - TSV8 <=> sqlite conversion. Are these drivers or what?
|
| 1404 | - and then let you pipe output?
|
| 1405 |
|
| 1406 | Do we also need TSV8 space2tab or something? For writing TSV8 inline.
|
| 1407 |
|
| 1408 | More later:
|
| 1409 | - MessagePack (e.g. for shared library extension modules)
|
| 1410 | - msgpack read, write? I think user-defined function could be like this?
|
| 1411 | - SASH: Simple and Strict HTML? For easy processing
|
| 1412 | -->
|
| 1413 |
|
| 1414 | ## YSH Modules are Files
|
| 1415 |
|
| 1416 | A module is a **file** of source code, like `lib/myargs.ysh`. The `use`
|
| 1417 | builtin turns it into an `Obj` that can be invoked and inspected:
|
| 1418 |
|
| 1419 | use myargs.ysh
|
| 1420 |
|
| 1421 | myargs proc1 --flag val # module name becomes a prefix, via __invoke__
|
| 1422 | var alias = myargs.proc1 # module has attributes
|
| 1423 |
|
| 1424 | You can import specific names with the `--pick` flag:
|
| 1425 |
|
| 1426 | use myargs.ysh --pick p2 p3
|
| 1427 |
|
| 1428 | p2
|
| 1429 | p3
|
| 1430 |
|
| 1431 | - [Feature Index: Modules](ref/feature-index.html#Modules)
|
| 1432 |
|
| 1433 | ## The Runtime Shared by OSH and YSH
|
| 1434 |
|
| 1435 | Although we describe OSH and YSH as different languages, they use the **same**
|
| 1436 | interpreter under the hood.
|
| 1437 |
|
| 1438 | This interpreter has many `shopt` booleans to control behavior, like `shopt
|
| 1439 | --set parse_paren`. The group `shopt --set ysh:all` flips all booleans to make
|
| 1440 | `bin/osh` behave like `bin/ysh`.
|
| 1441 |
|
| 1442 | Understanding this common runtime, and its interface to the Unix kernel, will
|
| 1443 | help you understand **both** languages!
|
| 1444 |
|
| 1445 | ### Interpreter Data Model
|
| 1446 |
|
| 1447 | The [Interpreter State](interpreter-state.html) doc is under construction. It
|
| 1448 | will cover:
|
| 1449 |
|
| 1450 | - The **call stack** for OSH and YSH
|
| 1451 | - Each *stack frame* is a `{name -> cell}` mapping.
|
| 1452 | - Each cell has a **value**, with boolean flags
|
| 1453 | - OSH has types `Str BashArray BashAssoc`, and flags `readonly export
|
| 1454 | nameref`.
|
| 1455 | - YSH has types `Bool Int Float Str List Dict Obj ...`, and the `readonly`
|
| 1456 | flag.
|
| 1457 | - YSH **namespaces**
|
| 1458 | - Modules with `use`
|
| 1459 | - Builtin functions and commands
|
| 1460 | - ENV
|
| 1461 | - Shell **options**
|
| 1462 | - Boolean options with `shopt`: `parse_paren`, `simple_word_eval`, etc.
|
| 1463 | - String options with `shvar`: `IFS`, `PATH`
|
| 1464 | - **Registers** that store interpreter state
|
| 1465 | - `$?` and `_error`
|
| 1466 | - `$!` for the last PID
|
| 1467 | - `_this_dir`
|
| 1468 | - `_reply`
|
| 1469 |
|
| 1470 | ### Process Model (the kernel)
|
| 1471 |
|
| 1472 | The [Process Model](process-model.html) doc is **under construction**. It will cover:
|
| 1473 |
|
| 1474 | - Simple Commands, `exec`
|
| 1475 | - Pipelines. #[shell-the-good-parts](#blog-tag)
|
| 1476 | - `fork`, `forkwait`
|
| 1477 | - Command and process substitution
|
| 1478 | - Related:
|
| 1479 | - [Tracing execution in Oils](xtrace.html) (xtrace), which divides
|
| 1480 | process-based concurrency into **synchronous** and **async** constructs.
|
| 1481 | - [Three Comics For Understanding Unix
|
| 1482 | Shell](http://www.oilshell.org/blog/2020/04/comics.html) (blog)
|
| 1483 |
|
| 1484 | <!--
|
| 1485 | Process model additions: Capers, Headless shell
|
| 1486 |
|
| 1487 | some optimizations: See YSH starts fewer processes than other shells.
|
| 1488 | -->
|
| 1489 |
|
| 1490 | ### Advanced: Reflecting on the Interpreter
|
| 1491 |
|
| 1492 | You can reflect on the interpreter with APIs like `io->eval()` and
|
| 1493 | `vm.getFrame()`.
|
| 1494 |
|
| 1495 | - [Feature Index: Reflection](ref/feature-index.html#Reflection)
|
| 1496 |
|
| 1497 | This allows YSH to be a language for creating other languages. (Ruby, Tcl, and
|
| 1498 | Racket also have this flavor.)
|
| 1499 |
|
| 1500 | <!--
|
| 1501 |
|
| 1502 | TODO: Hay and Awk examples
|
| 1503 | -->
|
| 1504 |
|
| 1505 | ## Summary
|
| 1506 |
|
| 1507 | What have we described in this tour?
|
| 1508 |
|
| 1509 | YSH is a programming language that evolved from Unix shell. But you can
|
| 1510 | "forget" the bad parts of shell like `[ $x -lt $y ]`.
|
| 1511 |
|
| 1512 | <!--
|
| 1513 | Instead, we've shown you shell-like commands, Python-like expressions on typed
|
| 1514 | data, and Ruby-like command blocks.
|
| 1515 | -->
|
| 1516 |
|
| 1517 | Instead, focus on these central concepts:
|
| 1518 |
|
| 1519 | 1. Interleaved *word*, *command*, and *expression* languages.
|
| 1520 | 2. A standard library of *builtin commands*, as well as *builtin functions*
|
| 1521 | 3. Languages for *data*: J8 Notation, including JSON8 and TSV8
|
| 1522 | 4. A *runtime* shared by OSH and YSH
|
| 1523 |
|
| 1524 | ## Appendix
|
| 1525 |
|
| 1526 | ### Related Docs
|
| 1527 |
|
| 1528 | - [YSH vs. Shell Idioms](idioms.html) - YSH side-by-side with shell.
|
| 1529 | - [YSH Language Influences](language-influences.html) - In addition to shell,
|
| 1530 | Python, and JavaScript, YSH is influenced by Ruby, Perl, Awk, PHP, and more.
|
| 1531 | - [A Feel For YSH Syntax](syntax-feelings.html) - Some thoughts that may help
|
| 1532 | you remember the syntax.
|
| 1533 | - [YSH Language Warts](warts.html) documents syntax that may be surprising.
|
| 1534 |
|
| 1535 |
|
| 1536 | ### YSH Script Template
|
| 1537 |
|
| 1538 | YSH can be used to write simple "shell scripts" or longer programs. It has
|
| 1539 | *procs* and *modules* to help with the latter.
|
| 1540 |
|
| 1541 | A module is just a file, like this:
|
| 1542 |
|
| 1543 | ```
|
| 1544 | #!/usr/bin/env ysh
|
| 1545 | ### Deploy script
|
| 1546 |
|
| 1547 | use $_this_dir/lib/util.ysh --pick log
|
| 1548 |
|
| 1549 | const DEST = '/tmp/ysh-tour'
|
| 1550 |
|
| 1551 | proc my-sync(...files) {
|
| 1552 | ### Sync files and show which ones
|
| 1553 |
|
| 1554 | cp --verbose @files $DEST
|
| 1555 | }
|
| 1556 |
|
| 1557 | proc main {
|
| 1558 | mkdir -p $DEST
|
| 1559 |
|
| 1560 | touch {foo,bar}.py {build,test}.sh
|
| 1561 |
|
| 1562 | log "Copying source files"
|
| 1563 | my-sync *.py *.sh
|
| 1564 |
|
| 1565 | if test --dir /tmp/logs {
|
| 1566 | cd /tmp/logs
|
| 1567 |
|
| 1568 | log "Copying logs"
|
| 1569 | my-sync *.log
|
| 1570 | }
|
| 1571 | }
|
| 1572 |
|
| 1573 | if is-main { # The only top-level statement
|
| 1574 | main @ARGV
|
| 1575 | }
|
| 1576 | ```
|
| 1577 |
|
| 1578 | <!--
|
| 1579 | TODO:
|
| 1580 | - Also show flags parsing?
|
| 1581 | - Show longer examples where it isn't boilerplate
|
| 1582 | -->
|
| 1583 |
|
| 1584 | You wouldn't bother with the boilerplate for something this small. But this
|
| 1585 | example illustrates the basic idea: the top level often contains these words:
|
| 1586 | `use`, `const`, `proc`, and `func`.
|
| 1587 |
|
| 1588 |
|
| 1589 | <!--
|
| 1590 | TODO: not mentioning __provide__, since it should be optional in the most basic usage?
|
| 1591 | -->
|
| 1592 |
|
| 1593 | ### YSH Features Not Shown
|
| 1594 |
|
| 1595 | #### Advanced
|
| 1596 |
|
| 1597 | These shell features are part of YSH, but aren't shown above:
|
| 1598 |
|
| 1599 | - The `fork` and `forkwait` builtins, for concurrent execution and subshells.
|
| 1600 | - Process Substitution: `diff <(sort left.txt) <(sort right.txt)`
|
| 1601 |
|
| 1602 | #### Deprecated Shell Constructs
|
| 1603 |
|
| 1604 | The shared interpreter supports many shell constructs that are deprecated:
|
| 1605 |
|
| 1606 | - YSH code uses shell's `||` and `&&` in limited circumstances, since `errexit`
|
| 1607 | is on by default.
|
| 1608 | - Assignment builtins like `local` and `declare`. Use YSH keywords.
|
| 1609 | - Boolean expressions like `[[ x =~ $pat ]]`. Use YSH expressions.
|
| 1610 | - Shell arithmetic like `$(( x + 1 ))` and `(( y = x ))`. Use YSH expressions.
|
| 1611 | - The `until` loop can always be replaced with a `while` loop.
|
| 1612 | - Most of what's in `${}` can be written in other ways. For example
|
| 1613 | `${s#/tmp}` could be `s.trimPrefix('/tmp')`.
|
| 1614 |
|
| 1615 | #### Not Yet Implemented
|
| 1616 |
|
| 1617 | This document mentions a few constructs that aren't yet implemented. Here's a
|
| 1618 | summary:
|
| 1619 |
|
| 1620 | ```none
|
| 1621 | # Unimplemented syntax:
|
| 1622 |
|
| 1623 | echo ${x|html} # formatters
|
| 1624 |
|
| 1625 | echo ${x %.2f} # statically-parsed printf
|
| 1626 |
|
| 1627 | var x = "<p>$x</p>"html
|
| 1628 | echo "<p>$x</p>"html # tagged string
|
| 1629 |
|
| 1630 | var x = 15 Mi # units suffix
|
| 1631 | ```
|
| 1632 |
|
| 1633 | <!--
|
| 1634 | - To implement: Capers: stateless coprocesses
|
| 1635 | -->
|
| 1636 |
|