| 1 | ul-table: Markdown Tables Without New Syntax
|
| 2 | ================================
|
| 3 |
|
| 4 | `ul-table` is an HTML processor that lets you write **tables** as bulleted
|
| 5 | **lists**, in Markdown.
|
| 6 |
|
| 7 | It's a short program I wrote because I got tired of reading and writing `<tr>`
|
| 8 | and `<td>` and `</td>` and `</tr>`. And I got tired of aligning numbers by
|
| 9 | writing `<td class="num">` for every cell.
|
| 10 |
|
| 11 | <div id="toc">
|
| 12 | </div>
|
| 13 |
|
| 14 | ## Simple Example
|
| 15 |
|
| 16 | Let's see how it works. How do you make this table?
|
| 17 |
|
| 18 | <style>
|
| 19 | table {
|
| 20 | margin: 0 auto;
|
| 21 | }
|
| 22 | td {
|
| 23 | padding-left: 1em;
|
| 24 | padding-right: 1em;
|
| 25 | }
|
| 26 | </style>
|
| 27 |
|
| 28 | <table>
|
| 29 |
|
| 30 | - thead
|
| 31 | - Shell
|
| 32 | - Version
|
| 33 | - tr
|
| 34 | - [bash](https://www.gnu.org/software/bash/)
|
| 35 | - 5.2
|
| 36 | - tr
|
| 37 | - [OSH](https://oils.pub/)
|
| 38 | - 0.25.0
|
| 39 |
|
| 40 | </table>
|
| 41 |
|
| 42 | With `ul-table`, you create a **two-level** Markdown list, inside `<table>`
|
| 43 | tags:
|
| 44 |
|
| 45 | <!-- TODO: Add pygments highlighting -->
|
| 46 |
|
| 47 | ```
|
| 48 | <table>
|
| 49 |
|
| 50 | - thead
|
| 51 | - Shell
|
| 52 | - Version
|
| 53 | - tr
|
| 54 | - [bash](https://www.gnu.org/software/bash/)
|
| 55 | - 5.2
|
| 56 | - tr
|
| 57 | - [OSH](https://oils.pub/)
|
| 58 | - 0.25.0
|
| 59 |
|
| 60 | </table>
|
| 61 | ```
|
| 62 |
|
| 63 | The header and data rows are at the top level, and the cells are indented under
|
| 64 | them.
|
| 65 |
|
| 66 | ---
|
| 67 |
|
| 68 | The conversion takes **2 steps**: it's Markdown → HTML → HTML.
|
| 69 |
|
| 70 | First, any Markdown processor will produce this list structure, with `<ul>` and
|
| 71 | `<li>`:
|
| 72 |
|
| 73 | - thead
|
| 74 | - Shell
|
| 75 | - Version
|
| 76 | - tr
|
| 77 | - [bash](https://www.gnu.org/software/bash/)
|
| 78 | - 5.2
|
| 79 | - tr
|
| 80 | - [OSH](https://oils.pub/)
|
| 81 | - 0.25.0
|
| 82 |
|
| 83 | Second, **our** `ul-table` plugin parses and transforms that into a table, with
|
| 84 | `<tr>` and `<td>`:
|
| 85 |
|
| 86 | <table>
|
| 87 |
|
| 88 | - thead
|
| 89 | - Shell
|
| 90 | - Version
|
| 91 | - tr
|
| 92 | - [bash](https://www.gnu.org/software/bash/)
|
| 93 | - 5.2
|
| 94 | - tr
|
| 95 | - [OSH](https://oils.pub/)
|
| 96 | - 0.25.0
|
| 97 |
|
| 98 | </table>
|
| 99 |
|
| 100 | So `ul-table` is an HTML processor, **not** a Markdown processor. But it's
|
| 101 | meant to be used with Markdown.
|
| 102 |
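To make the second step concrete, here is a minimal sketch in Python. It is not the actual `ul-table` code: it takes a DOM-style approach with the standard library's `xml.etree.ElementTree`, and it assumes the Markdown processor's output is well-formed XML. The function name `ul_to_table` is hypothetical, and emitting `<th>` for header cells is this sketch's choice.

```python
import xml.etree.ElementTree as ET


def ul_to_table(html: str) -> str:
    """Turn <table><ul>...</ul></table> into a table with <tr> and <td>.

    Hypothetical helper: assumes the input is well-formed XML.
    """
    table = ET.fromstring(html)
    top_list = table.find("ul")
    if top_list is None:
        raise ValueError("expected a <ul> inside <table>")

    out = ET.Element("table", dict(table.attrib))
    for row in top_list.findall("li"):
        kind = (row.text or "").strip()  # "thead" or "tr"
        tr = ET.Element("tr")
        cells = row.find("ul")
        for cell in (cells.findall("li") if cells is not None else []):
            # Reuse the <li> element so nested markup (links, etc.) is kept.
            cell.tag = "th" if kind == "thead" else "td"
            cell.tail = None
            tr.append(cell)
        parent = ET.SubElement(out, "thead") if kind == "thead" else out
        parent.append(tr)

    return ET.tostring(out, encoding="unicode")


# What a Markdown processor might emit for a two-row ul-table (whitespace removed).
markdown_output = (
    "<table><ul>"
    "<li>thead<ul><li>Shell</li><li>Version</li></ul></li>"
    '<li>tr<ul><li><a href="https://oils.pub/">OSH</a></li><li>0.25.0</li></ul></li>'
    "</ul></table>"
)
print(ul_to_table(markdown_output))
```

The link inside the cell survives untouched, because the sketch only renames list elements and never looks inside a cell.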
|
| 103 | ## Design
|
| 104 |
|
| 105 | ### Goals
|
| 106 |
|
| 107 | <!--
|
| 108 | This means your docs are still readable without it, e.g. on sourcehut or
|
| 109 | Github. It degrades gracefully.
|
| 110 | -->
|
| 111 |
|
| 112 | - Don't invent any new syntax.
|
| 113 | - It reuses your knowledge of Markdown — e.g. hyperlinks.
|
| 114 | - It reuses your knowledge of HTML — e.g. attributes on tags.
|
| 115 | - Large, complex tables should be maintainable.
|
| 116 | - The user should have the **full** power of HTML. We don't hide it under
|
| 117 | another language, like MediaWiki does.
|
| 118 | - Degrade gracefully. Because it's just Markdown, you **won't break** docs by
|
| 119 | adding it.
|
| 120 | - The intermediate list form is what sourcehut or Github will show.
|
| 121 |
|
| 122 | ### Comparison
|
| 123 |
|
| 124 | Compared to other table markup formats, `ul-table` is shorter, less noisy, and
|
| 125 | easier to edit:
|
| 126 |
|
| 127 | - [ul-table Comparison: Github, Wikipedia, reStructuredText, AsciiDoc](ul-table-compare.html)
|
| 128 |
|
| 129 | ## Details
|
| 130 |
|
| 131 | ### ul-table "Grammar"
|
| 132 |
|
| 133 | Recall that a `ul-table` is a **two-level Markdown list**, between `<table>`
|
| 134 | tags. The top level list contains either:
|
| 135 |
|
| 136 | <table>
|
| 137 |
|
| 138 | - tr
|
| 139 | - `thead`
|
| 140 | - zero or one, at the beginning
|
| 141 | - tr
|
| 142 | - `tr`
|
| 143 | - zero or more, after `thead`
|
| 144 |
|
| 145 | </table>
|
| 146 |
|
| 147 | The second level contains the contents of cells, but you **don't** write `td`
|
| 148 | or `<td>`.
|
| 149 |
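The grammar above is small enough to check in a few lines. The following is a sketch, not part of `ul-table`; the function name `check_row_kinds` is hypothetical, and it assumes the top-level item labels have already been extracted as strings.

```python
def check_row_kinds(kinds: list[str]) -> None:
    """Raise ValueError unless kinds matches the grammar: [thead] tr*"""
    for i, kind in enumerate(kinds):
        if kind == "thead":
            if i != 0:
                raise ValueError(f"item {i}: thead is only allowed at the beginning")
        elif kind != "tr":
            raise ValueError(f"item {i}: expected 'thead' or 'tr', got {kind!r}")


check_row_kinds(["thead", "tr", "tr"])  # header followed by two data rows
check_row_kinds(["tr"])                 # thead is optional
check_row_kinds([])                     # an empty table is allowed too
```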
|
| 150 | ### Stylesheet
|
| 151 |
|
| 152 | To make the table look nice, I add a `<style>` tag, inside Markdown:
|
| 153 |
|
| 154 | <style>
|
| 155 | table {
|
| 156 | margin: 0 auto;
|
| 157 | }
|
| 158 | td {
|
| 159 | padding-left: 1em;
|
| 160 | padding-right: 1em;
|
| 161 | }
|
| 162 | </style>
|
| 163 |
|
| 164 | ## Adding HTML Attributes
|
| 165 |
|
| 166 | HTML attributes like `<tr class=foo>` and `<td id=bar>` let you format and
|
| 167 | style your table.
|
| 168 |
|
| 169 | You can add attributes to cells, columns, and rows.
|
| 170 |
|
| 171 | ### Cells
|
| 172 |
|
| 173 | <style>
|
| 174 | .hi { background-color: thistle }
|
| 175 | </style>
|
| 176 |
|
| 177 | <table>
|
| 178 |
|
| 179 | - thead
|
| 180 | - Name
|
| 181 | - Age
|
| 182 | - tr
|
| 183 | - Alice
|
| 184 | - 42 <cell-attrs class=hi />
|
| 185 | - tr
|
| 186 | - Bob
|
| 187 | - 9
|
| 188 |
|
| 189 | </table>
|
| 190 |
|
| 191 | Add cell attributes with a `cell-attrs` tag after the cell contents:
|
| 192 |
|
| 193 | ```
|
| 194 | - thead
|
| 195 | - Name
|
| 196 | - Age
|
| 197 | - tr
|
| 198 | - Alice
|
| 199 | - 42 <cell-attrs class=hi />
|
| 200 | - tr
|
| 201 | - Bob
|
| 202 | - 9
|
| 203 | ```
|
| 204 |
|
| 205 | You must use a **self-closing** tag:
|
| 206 |
|
| 207 | <cell-attrs /> # Yes
|
| 208 | <cell-attrs> # No: this is an opening tag
|
| 209 |
|
| 210 | Notice that `ul-table` takes the attributes from the `<cell-attrs />` tag and
|
| 211 | puts them on the generated `<td>` tag.
|
| 212 |
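As a rough illustration of that move, here is a sketch using `xml.etree.ElementTree`. It is not how `ul-table` is implemented. The name `apply_cell_attrs` is hypothetical, and the sketch assumes quoted attribute values (`class="hi"`) so the cell parses as XML, and at most one `<cell-attrs />` per cell.

```python
import xml.etree.ElementTree as ET


def apply_cell_attrs(td: ET.Element) -> None:
    """Copy attributes from a child <cell-attrs /> onto td, then drop the child."""
    children = list(td)
    for i, child in enumerate(children):
        if child.tag != "cell-attrs":
            continue
        td.attrib.update(child.attrib)
        # Keep any text that followed the marker tag.
        tail = child.tail or ""
        if i == 0:
            td.text = (td.text or "") + tail
        else:
            prev = children[i - 1]
            prev.tail = (prev.tail or "") + tail
        td.remove(child)
        break


td = ET.fromstring('<td>42 <cell-attrs class="hi" /></td>')
apply_cell_attrs(td)
print(ET.tostring(td, encoding="unicode"))
```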
|
| 213 | ### Columns
|
| 214 |
|
| 215 | <style>
|
| 216 | .num {
|
| 217 | text-align: right;
|
| 218 | }
|
| 219 | </style>
|
| 220 |
|
| 221 | <table>
|
| 222 |
|
| 223 | - thead
|
| 224 | - Name
|
| 225 | - Age <cell-attrs class=num />
|
| 226 | - tr
|
| 227 | - Alice
|
| 228 | - 42
|
| 229 | - tr
|
| 230 | - Bob
|
| 231 | - 9
|
| 232 |
|
| 233 | </table>
|
| 234 |
|
| 235 | To add attributes to **every cell in a column**, put `<cell-attrs />` in the
|
| 236 | `thead` section:
|
| 237 |
|
| 238 | <style>
|
| 239 | .num {
|
| 240 | background-color: bisque;
|
| 241 | text-align: right;
|
| 242 | }
|
| 243 | </style>
|
| 244 |
|
| 245 | ```
|
| 246 | - thead
|
| 247 | - Name
|
| 248 | - Age <cell-attrs class=num />
|
| 249 | - tr
|
| 250 | - Alice
|
| 251 | - 42 <!-- this cell gets class=num -->
|
| 252 | - tr
|
| 253 | - Bob
|
| 254 | - 9 <!-- this cell gets class=num -->
|
| 255 | ```
|
| 256 |
|
| 257 | Then every `<td>` in the column will "inherit" those attributes. This is
|
| 258 | useful for aligning numbers to the right:
|
| 259 |
|
| 260 | <style>
|
| 261 | .num {
|
| 262 | text-align: right;
|
| 263 | }
|
| 264 | </style>
|
| 265 |
|
| 266 | If the same attribute appears in a column in both `thead` and `tr`, the values
|
| 267 | are **concatenated**, with a space. Example:
|
| 268 |
|
| 269 | <td class="from-thead from-tr">
|
| 270 |
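The concatenation rule can be sketched as a small dictionary merge. This is an illustration under assumptions, not the real code: the function name `merge_attrs` is hypothetical, attributes are modeled as plain dicts, and the `thead` value comes first, as in the `class="from-thead from-tr"` example.

```python
def merge_attrs(column: dict[str, str], cell: dict[str, str]) -> dict[str, str]:
    """Combine attributes from the thead column with those on a tr cell."""
    merged = dict(column)
    for name, value in cell.items():
        if name in merged:
            # Same attribute in both places: join the values with a space.
            merged[name] = merged[name] + " " + value
        else:
            merged[name] = value
    return merged


print(merge_attrs({"class": "from-thead"}, {"class": "from-tr", "id": "bar"}))
# {'class': 'from-thead from-tr', 'id': 'bar'}
```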
|
| 271 | ### Rows
|
| 272 |
|
| 273 | <style>
|
| 274 | .special-row {
|
| 275 | background-color: powderblue;
|
| 276 | }
|
| 277 | </style>
|
| 278 |
|
| 279 | <table>
|
| 280 |
|
| 281 | - thead
|
| 282 | - Name
|
| 283 | - Age
|
| 284 | - tr
|
| 285 | - Alice
|
| 286 | - 42
|
| 287 | - tr <row-attrs class="special-row" />
|
| 288 | - Bob
|
| 289 | - 9
|
| 290 |
|
| 291 | </table>
|
| 292 |
|
| 293 | To add row attributes, put `<row-attrs />` after the `- tr`:
|
| 294 |
|
| 295 | - thead
|
| 296 | - Name
|
| 297 | - Age
|
| 298 | - tr
|
| 299 | - Alice
|
| 300 | - 42
|
| 301 | - tr <row-attrs class="special-row" />
|
| 302 | - Bob
|
| 303 | - 9
|
| 304 |
|
| 305 | ## More Complex Example
|
| 306 |
|
| 307 | This example uses more features, like Markdown and HTML inside cells. You may
|
| 308 | want to view the source text for this table: [doc/ul-table.md]($oils-src).
|
| 309 |
|
| 310 | [bash]: $xref
|
| 311 |
|
| 312 | <table id="foo">
|
| 313 |
|
| 314 | - thead
|
| 315 | - Shell
|
| 316 | - Version
|
| 317 | - Example Code
|
| 318 | - tr
|
| 319 | - [bash][]
|
| 320 | - 5.2
|
| 321 | - ```
|
| 322 | echo sh=$bash
|
| 323 | ls /tmp | wc -l
|
| 324 | echo
|
| 325 | ```
|
| 326 | - tr
|
| 327 | - [dash]($xref)
|
| 328 | - 1.5
|
| 329 | - <em>Inline HTML</em>
|
| 330 | - tr
|
| 331 | - [mksh]($xref)
|
| 332 | - 4.0
|
| 333 | - <table>
|
| 334 | <tr>
|
| 335 | <td>HTML table</td>
|
| 336 | <td>inside</td>
|
| 337 | </tr>
|
| 338 | <tr>
|
| 339 | <td>this table</td>
|
| 340 | <td>no way to re-enter inline markdown though?</td>
|
| 341 | </tr>
|
| 342 | </table>
|
| 343 | - tr
|
| 344 | - [zsh]($xref)
|
| 345 | - 3.6
|
| 346 | - Unordered List
|
| 347 | - one
|
| 348 | - two
|
| 349 | - tr
|
| 350 | - [yash]($xref)
|
| 351 | - 1.0
|
| 352 | - Ordered List
|
| 353 | 1. one
|
| 354 | 1. two
|
| 355 | - tr
|
| 356 | - [ksh]($xref)
|
| 357 | - This is
|
| 358 | paragraph one.
|
| 359 |
|
| 360 | This is
|
| 361 | paragraph two
|
| 362 | - Another cell with ...
|
| 363 |
|
| 364 | ... multiple paragraphs.
|
| 365 |
|
| 366 | </table>
|
| 367 |
|
| 368 |
|
| 369 |
|
| 370 | Another table:
|
| 371 |
|
| 372 | <style>
|
| 373 | .osh-code { color: darkred }
|
| 374 | .ysh-code { color: darkblue }
|
| 375 | </style>
|
| 376 |
|
| 377 |
|
| 378 | <table>
|
| 379 |
|
| 380 | - thead
|
| 381 | - OSH
|
| 382 | - YSH
|
| 383 | - tr
|
| 384 | - ```
|
| 385 | my-copy() {
|
| 386 | cp --verbose "$@"
|
| 387 | }
|
| 388 | ```
|
| 389 | <cell-attrs class=osh-code />
|
| 390 | - ```
|
| 391 | proc my-copy {
|
| 392 | cp --verbose @ARGV
|
| 393 | }
|
| 394 | ```
|
| 395 | <cell-attrs class=ysh-code />
|
| 396 | - tr
|
| 397 | - x
|
| 398 | - y
|
| 399 |
|
| 400 | </table>
|
| 401 |
|
| 402 |
|
| 403 | ## Markdown Quirks
|
| 404 |
|
| 405 | Here are some quirks I ran into when using `ul-table`.
|
| 406 |
|
| 407 | (1) CommonMark doesn't allow empty list items:
|
| 408 |
|
| 409 | - thead
|
| 410 | -
|
| 411 | - above is not rendered as a list item
|
| 412 |
|
| 413 | You can work around this by using a comment or an invisible character:
|
| 414 |
|
| 415 | - tr
|
| 416 | - <!-- empty -->
|
| 417 | - above is OK
|
| 418 | - tr
|
| 419 | -
|
| 420 | - also OK
|
| 421 |
|
| 422 | - [Related CommonMark thread](https://talk.commonmark.org/t/clarify-following-empty-list-items-in-0-31-2/4599)
|
| 423 |
|
| 424 | (2) Similarly, a cell with a literal hyphen may need a comment or space in
|
| 425 | front of it:
|
| 426 |
|
| 427 | - tr
|
| 428 | - <!-- hyphen --> -
|
| 429 | - -
|
| 430 |
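If you generate `ul-table` markup from data, both workarounds can be applied automatically. The sketch below is one way to do that. The helper names `cell_markup` and `rows_to_ul_table` are hypothetical, the two-space indent for cells is an assumption, and treating `-` or a leading `- ` as the "literal hyphen" case is my reading of quirk (2).

```python
def cell_markup(text: str) -> str:
    """Return the Markdown for one cell, applying the workarounds above."""
    if text == "":
        return "<!-- empty -->"  # quirk (1): avoid an empty list item
    if text == "-" or text.startswith("- "):
        return "<!-- hyphen --> " + text  # quirk (2): avoid starting a nested list
    return text


def rows_to_ul_table(header: list[str], rows: list[list[str]]) -> str:
    """Build a two-level Markdown list between <table> tags."""
    lines = ["<table>", "", "- thead"]
    lines += [f"  - {cell_markup(cell)}" for cell in header]
    for row in rows:
        lines.append("- tr")
        lines += [f"  - {cell_markup(cell)}" for cell in row]
    lines += ["", "</table>"]
    return "\n".join(lines)


print(rows_to_ul_table(["Name", "Note"], [["Alice", ""], ["Bob", "-"]]))
```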
|
| 431 | ## Conclusion
|
| 432 |
|
| 433 | `ul-table` is a nice way of writing and maintaining HTML tables. The appendix
|
| 434 | has links and details.
|
| 435 |
|
| 436 | ### Related Docs
|
| 437 |
|
| 438 | - [ul-table Comparison: Github, Wikipedia, reStructuredText, AsciiDoc](ul-table-compare.html)
|
| 439 | - [How We Build Oils Documentation](doc-toolchain.html)
|
| 440 | - [Examples of HTML Plugins](doc-plugins.html)
|
| 441 |
|
| 442 | ## Appendix: Implementation
|
| 443 |
|
| 444 | - [doctools/ul_table.py]($oils-src) - about 500 lines
|
| 445 | - [lazylex/html.py]($oils-src) - about 500 lines
|
| 446 |
|
| 447 | ### Notes on the Algorithm
|
| 448 |
|
| 449 | - lazy lexing
|
| 450 | - recursive descent parser
|
| 451 | - TODO: show grammar
|
| 452 |
|
| 453 | TODO: I would like someone to produce a **DOM**-based implementation!
|
| 454 |
|
| 455 | Our implementation is pretty low-level. It's meant to avoid the "big load
|
| 456 | anti-pattern" (allocating too much), so it's necessarily more verbose.
|
| 457 |
|
| 458 | A DOM-based implementation should be much less than 1000 lines.
|
| 459 |
|
| 460 | ## Appendix: Real Examples
|
| 461 |
|
| 462 | - Docs
|
| 463 | - [Guide to Procs and Funcs](proc-func.html) has a big `ul-table`.
|
| 464 | Source: [doc/proc-func.md]($oils-src)
|
| 465 | - Oils Reference
|
| 466 | - [Chapter: Word Language](ref/chap-word-lang.html#op-format) has
|
| 467 | tables to document `${x@a}`
|
| 468 | - Site
|
| 469 | - [oils.pub Home Page](/)
|
| 470 | - [Blog Index](/blog/)
|
| 471 |
|
| 472 | I converted the tables in these 2024 posts to `ul-table`:
|
| 473 |
|
| 474 | - [What Oils Looks Like in 2024](https://www.oilshell.org/blog/2024/09/project-overview.html)
|
| 475 | - [After 8 Years, Oils Is Still Small and Flexible](https://www.oilshell.org/blog/2024/09/line-counts.html)
|
| 476 | - [Garbage Collection Makes YSH Different](https://www.oilshell.org/blog/2024/09/gc.html)
|
| 477 | - [A Retrospective on the Oils Project](https://www.oilshell.org/blog/2024/09/retrospective.html)
|
| 478 | - [Oils 0.22.0 Announcement](https://www.oilshell.org/blog/2024/06/release-0.22.0.html#data-languages) - table of multi-line string literals
|
| 479 |
|
| 480 | The markup was much shorter and simpler after conversion!
|
| 481 |
|
| 482 | TODO:
|
| 483 |
|
| 484 | - More tables to make
|
| 485 | - Interior/Exterior
|
| 486 | - Narrow Waist
|
| 487 | - Wiki pages could use conversion
|
| 488 | - [Alternative Shells]($wiki)
|
| 489 | - [Alternative Regex Syntax]($wiki)
|
| 490 | - [Survey of Config Languages]($wiki)
|
| 491 | - [Polyglot Language Understanding]($wiki)
|
| 492 | - [The Biggest Shell Programs in the World]($wiki)
|
| 493 |
|
| 494 | ## HTML Quirks
|
| 495 |
|
| 496 | - `<th>` is like `<td>`, but it belongs in `<thead><tr>`. Browsers make it
|
| 497 | bold and centered.
|
| 498 | - `<colgroup>` and `<col>` often don't do what I want.
|
| 499 | - As mentioned above, you can't put `class=` on columns and align them to the
|
| 500 | right or left. You have to put `class=` on *every* `<td>` cell instead.
|
| 501 |
|
| 502 | <!--
|
| 503 |
|
| 504 | ### FAQ
|
| 505 |
|
| 506 | (1) Why do rows with attributes look like `tr <row-attrs />`? The first `tr`
|
| 507 | doesn't seem necessary.
|
| 508 |
|
| 509 | This is because of the CommonMark quirk above: a list item without **text** is
|
| 510 | treated as **empty**. So we require the extra `tr` text.
|
| 511 |
|
| 512 | It's also consistent with plain rows, without attributes.
|
| 513 |
|
| 514 | -->
|
| 515 |
|
| 516 | ## Ideas for Features
|
| 517 |
|
| 518 | - Support `tfoot`?
|
| 519 | - Emit `tbody`?
|
| 520 |
|
| 521 | ---
|
| 522 |
|
| 523 | We could help users edit well-formed tables with enforced column names:
|
| 524 |
|
| 525 | - thead
|
| 526 | - <cell-attrs ult-name=name /> Name
|
| 527 | - <cell-attrs ult-name=age /> Age
|
| 528 | - tr
|
| 529 | - <cell-attrs ult-name=name /> Hi
|
| 530 | - <cell-attrs ult-name=age /> 5
|
| 531 |
|
| 532 | This is a bit verbose, but may be worth it for large tables.
|
| 533 |
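The check itself would be simple. Here is a sketch of what enforcing column names could look like, assuming the `ult-name` values have already been collected per row. This feature does not exist yet, and `check_column_names` is a hypothetical name.

```python
def check_column_names(header: list[str], rows: list[list[str]]) -> list[str]:
    """Return one error message per row whose cell names differ from the header's."""
    errors = []
    for i, names in enumerate(rows):
        if names != header:
            errors.append(f"row {i}: expected {header}, got {names}")
    return errors


header = ["name", "age"]
print(check_column_names(header, [["name", "age"]]))  # no errors: []
print(check_column_names(header, [["age", "name"]]))  # one error: cells swapped
```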
|
| 534 | Less verbose syntax idea:
|
| 535 |
|
| 536 | - thead
|
| 537 | - <ult col=NAME /> <cell-attrs class=foo /> Name
|
| 538 | - <ult col=AGE /> Age
|
| 539 | - tr
|
| 540 | - <ult col=NAME /> Hi
|
| 541 | - <ult col=AGE /> 5
|
| 542 |
|
| 543 | Even less verbose:
|
| 544 |
|
| 545 | - thead
|
| 546 | - {NAME} Name
|
| 547 | - {AGE} Age
|
| 548 | - tr
|
| 549 | - {NAME} Hi
|
| 550 | - {AGE} 5
|
| 551 |
|
| 552 | The obvious problem is that we might want the literal text `{NAME}` in the
|
| 553 | header. It's unlikely, but possible.
|
| 554 |
|
| 555 |
|
| 556 | <!--
|
| 557 |
|
| 558 | TODO: We should detect cell-attrs before the closing `</li>`, or in any
|
| 559 | position?
|
| 560 |
|
| 561 | <table>
|
| 562 |
|
| 563 | - thead
|
| 564 | - OSH
|
| 565 | - YSH
|
| 566 | - tr
|
| 567 | - ```
|
| 568 | my-copy() {
|
| 569 | cp --verbose "$@"
|
| 570 | }
|
| 571 | ```
|
| 572 | <cell-attrs class=osh-code />
|
| 573 | - ```
|
| 574 | proc my-copy {
|
| 575 | cp --verbose @ARGV
|
| 576 | }
|
| 577 | ```
|
| 578 | <cell-attrs class=ysh-code />
|
| 579 |
|
| 580 | </table>
|
| 581 |
|
| 582 | -->
|
| 583 |
|
| 584 |
|
| 585 | <!--
|
| 586 | TODO:
|
| 587 |
|
| 588 | - change back to oilshell.org/ for publishing
|
| 589 | - Compare to wikipedia
|
| 590 | - https://en.wikipedia.org/wiki/Help:Table
|
| 591 | - table caption - this is just <caption>
|
| 592 | - rowspan
|
| 593 | -->
|